Propositions
accompanying the thesis
The Evolution of Type Theory in
Logic and Mathematics

by

Twan Laan


1. Type theory only began to ramify after it had become clear that
ramification in type theory is not necessary (this thesis, cover).

2. The qualifier “simple” in “simple type theory” does not refer to the
fact that the simply typed λ-calculus is one of the simplest typed
λ-calculi, but indicates a type system that has no ramified types.
Therefore all typed λ-calculi in the Barendregt cube are examples of
“simple type theories”.

3. Adding a parameter mechanism to Pure Type Systems is not an
extension of this framework but a refinement of it (this thesis,
Chapter 6).

4. Writing a thesis is like reducing a term in a term rewriting system
that is not necessarily confluent or strongly normalising, but (owing to
the finite lifespan of PhD candidates) is weakly normalising.

5. An important requirement on a scientific experiment is its
repeatability. It is therefore debatable whether computer science can be
called a science as long as one of the most frequently given pieces of
advice for computer quirks is: “switch it off, switch it on, and see
whether it works then”.

6. Obtaining travel advice for a journey by public transport is
comparable to trying on clothes in a clothes shop and should therefore
(just like trying on clothes) be free of charge.

7. To “get the motorist into public transport”, fare integration
between public transport and the private car is at least as important
as fare integration among the various forms of public transport.

8. The fact that there are officials charged with checking and issuing
public transport tickets who wrongly accept certain swimming pool
season tickets as valid tickets on certain buses indicates that
something is seriously wrong with the clarity of the fare structure (and
hence with the customer-friendliness) of Dutch public transport.

9. Obtaining the signatures needed to complete the bureaucratic
process preceding a doctoral defence ceremony is made considerably
easier if the first supervisor is also dean of the faculty concerned.

10. English and French are two languages that cannot come fluently out
of one mouth.
Introduction


Nowadays, type theory has many applications and is used in a lot of different 
disciplines. Even within logic and mathematics, there are a lot of different 
type systems. They serve several purposes, and are formulated in various 
ways. In this thesis we present a formal framework in which various type 
theories can be described. This framework is an extension of an already 
existing framework. 

For the development of this framework, we follow the evolution of type
theory throughout the past century. However, we do not merely give a
historical description. Our goal is not to describe the various type
systems that have been developed in their historical setting, but to
present them in a modern framework. In this way it becomes clear
how the various type systems are related to each other, even if
those systems were originally described in very different ways. Moreover,
we can make clear what the essence, or the common basis, of the various
modern type theories is.

The historical line in this thesis is, therefore, only part of our method 
of research, and definitely not a goal of our research. 


The approach 


Following the historical line from Frege (1879) to today, we are confronted 
with various type systems. Often, such a system has already been described 
in a modern framework, but the relation between the modern description 
and the original system has not always been made clear. This is particularly 
the case when the original system is quite far from the modern framework 
with respect to notation, level of formality and/or purpose. We will focus 
on such type systems. We describe them within the framework in such a
way that:


• We respect the ideas and the philosophy underlying the original
system;


• We meet contemporary requirements on formality and accuracy.


As the basis for our framework we choose typed λ-calculus, more
specifically the framework of Pure Type Systems (PTSs). There are several
reasons for this choice:


• Many type systems have already been placed in this framework (see
Example 5.2.4 of [5]);


• PTSs meet contemporary requirements on formality and accuracy;


• PTSs focus on the heart of type theory: functionalisation and
instantiation (see below). This makes it possible to compare type systems
in a very fundamental way, without being hindered by things that do
not touch the heart of the matter;


• Though PTSs focus on the heart of type theory, they are easily
extendible in several ways. Many extensions have already been described
in the literature:

- PTSs with definitions, introduced in [114];
- PTSs with modalities, introduced in [18];
- PTSs with sum types, see [6];
- PTSs with quotient types, see [68], [6];
- PTSs with subset types, see [6].

Another extension (PTSs with parameters and parametric definitions)
is studied in this thesis (see Chapter 6);


• The meta-theory that has been developed for PTSs makes it easier to
access, develop and compare meta-theoretic properties of the various
original type systems.
By placing several systems in the PTS framework, we also find some
omissions in this framework. In particular, there is no extension of PTSs
with parameters, while parameters play an important role in the type
systems underlying the proof checker AUTOMATH [95]. Extending PTSs with
parameters not only opens the possibility of placing AUTOMATH more
accurately in the framework of PTSs, it also makes it possible to give a
better classification of more modern type systems and their applications.

In the above, we claimed that PTSs focus on the heart of type theory: 
functionalisation and instantiation. We now describe what we mean by 
functionalisation and instantiation, and why these two notions are in our 
opinion the heart of the matter. 


The heart of type theory 


The explicit and formal use of types (and thus an early form of what is 
presently called “type theory”) was originally intended to prevent the para- 
doxes that occurred in logic and mathematics at the end of the 19th and 
the beginning of the 20th century. But it was not the only method devel- 
oped for this purpose. Another tool was the fine-tuning of Cantor’s Set 
Theory [25, 26] by Zermelo [123]. The approach of type theory, however,
is completely different from the set-theoretical approach. Type theory
focuses on the notion of function in logic and mathematics, and throughout
the history of type theory, functions have remained one of the main objects 
of study for type theorists. 

In the abstract theory of functions, there are only two important con- 
structions: functionalisation and instantiation. We now discuss both con- 
struction principles. 


Functionalisation 


Consider the mathematical expression 2 + 3. This expression indicates 
the addition of the number 2 to the number 3. Both 2 and 3 are fixed 
objects. But we can think not only of the addition of 2 to 3, but also of
the addition of any other number to 3. This means that we replace the
object 2 by a symbol that denotes “any natural number”: a variable (say:
x). This results in the expression x + 3. This expression does not denote
one specific natural number, but if we replace x by a natural number, then
the resulting expression represents a natural number. This replacement
activity is similar for the various possibilities for x, and gives rise to an
algorithm: we feed a natural number to the algorithm, the algorithm adds
3 to that natural number, and returns the result to us. This algorithm is
called a function.

Notice that the function that returns x + 3 whenever we assign a natural
number to x is more than just the expression x + 3: the expression x +
3 denotes some natural number, but the function denotes an algorithm.
Moreover, the expression y + 3 is an expression that is different from x + 3.
We can have two different natural numbers in mind for x and y. But if we 
construct functions from x + 3 and y + 3 by the method described above, 
we obtain the same algorithm. 
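
The difference between the expression and the function can be made concrete in a programming language. The following is a minimal sketch in Python (a language chosen for this illustration only, not one used in the thesis); the names expression_value, add_three_x and add_three_y are assumptions introduced here. The final check samples finitely many inputs, so it illustrates rather than proves that both functions denote the same algorithm.

```python
# The expression x + 3 only denotes a number once x has been given a value.
x = 2
expression_value = x + 3          # one specific natural number

# The function is the algorithm itself: it can be fed any natural number.
add_three_x = lambda x: x + 3     # constructed from the expression x + 3
add_three_y = lambda y: y + 3     # constructed from the expression y + 3

# Different variable names, same behaviour on every input that is tried.
same_on_sample = all(add_three_x(n) == add_three_y(n) for n in range(100))
```

The two expressions differ as pieces of syntax, but the two functions cannot be told apart by feeding them arguments.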

There are various ways to denote the algorithm in the example above: 


• Frege ([45], 1879) simply wrote x + 3, and made no distinction
between the algorithm and the expression x + 3;¹


• Russell ([121], 1910) was clearer on this point, and wrote x̂ + 3 to
distinguish the algorithm from the expression x + 3;


• Church ([28], 1932) used λx.x + 3 where Russell wrote x̂ + 3;


• In the proof checker AUTOMATH ([95], 1968), the notation [x:ℕ]x + 3
is used;


• In explicitly typed λ-calculi (also known as λ-calculi “in
Church-style”) one usually writes λx:ℕ.x + 3. The x:ℕ behind the λ
denotes that the algorithm requires “special food”. We cannot just feed
it anything we want: it only eats natural numbers;


• In many mathematical texts, we find the notation x ↦ x + 3.


¹ Frege used the notation x(x-+3) for what is called the course-of-values
of the algorithm above, but not for the algorithm itself. See Section 1b1.

Thus, the process of constructing a function can be split up into two
parts:

1. Abstraction from a subexpression. First, we replace a subexpression
k in an expression f by a variable x, at one or more places where k
occurs in f. Thus we obtain a new expression, f′;

2. Function construction. Then we construct the function λx.f′ that
assigns f′[x:=a] to a value a that we feed it. Here, f′[x:=a] denotes
f′, in which a has been substituted for x. Sometimes, “substituted”
simply means “replaced” (as in λ-calculus). In other systems, like
Russell’s Ramified Theory of Types, substitution is a more complicated
operation (see Section 2a4).


We call this process: Functionalisation. 
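
The two parts can be sketched on a small symbolic representation of expressions. The following Python sketch is an illustration under stated assumptions: terms are nested tuples, the function names abstract and functionalise are introduced here, and every occurrence of k is replaced (the text allows replacing only some of them).

```python
# Terms as nested tuples (a representation assumed for this sketch only):
#   ("num", 2)   ("var", "x")   ("add", left, right)   ("lam", "x", body)

def abstract(term, k, x):
    """Part 1: replace every occurrence of the subexpression k in term by x."""
    if term == k:
        return ("var", x)
    if term[0] == "add":
        return ("add", abstract(term[1], k, x), abstract(term[2], k, x))
    return term  # anything else is left as it is; λ-terms are not entered

def functionalise(term, k, x):
    """Parts 1 and 2 together: build the function λx.f' from f and k."""
    return ("lam", x, abstract(term, k, x))

two_plus_three = ("add", ("num", 2), ("num", 3))           # f  = 2 + 3
f_prime = abstract(two_plus_three, ("num", 2), "x")        # f' = x + 3
function = functionalise(two_plus_three, ("num", 2), "x")  # λx.x + 3
```

Here f_prime is still an expression with a free variable, while function is the λ-term built from it in the second part.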
The first part of the functionalisation process, abstraction from a
subexpression, is already present in Frege’s Begriffsschrift:


“If in an expression, [...] a simple or a compound sign has one
or more occurrences and if we regard that sign as replaceable in
all or some of these occurrences by something else (but
everywhere by the same thing), then we call the part that remains
invariant in the expression a function, and the replaceable part
the argument of the function.”


(Begriffsschrift, Section 9)


Frege, however, did not introduce separate notations for, for example, the
expression x + 3 and the function λx.x + 3. Hence, Frege did not employ
the second part of functionalisation, the function construction.

However, both parts of the functionalisation process are present in
Principia Mathematica by Whitehead and Russell [121]. The first part is
represented by the “vice versa” part of *9·14 below, and the combination of
the first and the second part is represented by *9·15 below:


“*9·14. If ‘φx’ is significant, then if x is of the same type as a,
‘φa’ is significant, and vice versa.

*9·15. If, for some a, there is a proposition φa, then there is
a function φx̂, and vice versa”


(Principia Mathematica, p. 133)


Here, φx denotes an expression in which a variable x occurs. Similarly, φa
denotes an expression in which a sub-expression a occurs, and φx̂ denotes
the function (algorithm) that assigns the value φa to an argument a.
Both the Begriffsschrift and Principia Mathematica exclude so-called
constant functions. A function like λx.3 cannot be constructed in these
theories, because the expression 3 does not contain a variable x. This class
of functions can be obtained by weakening the procedure of abstraction from
subexpressions in the functionalisation procedure: we can replace an object
in an expression f by a variable x at zero places where this object occurs
in f (the object does not even have to occur in f). If we then apply the
second part of the functionalisation procedure, we can obtain a constant
function like λx.3.

Functions of more variables can be constructed by repeatedly applying
functionalisation. This repetition process is often called “currying” after
H.B. Curry, and is usually credited to Schönfinkel ([109], 1924), but some
of the ideas behind it are already present in Frege’s Begriffsschrift (1879):


“If, given a function, we think of a sign² that was hitherto
regarded as not replaceable as being replaceable at some or all of
its occurrences, then by adopting this conception we obtain a
function that has a new argument in addition to those it had
before.”


(Begriffsschrift, Section 9)


For Frege, this procedure of introducing several variables one by one results
in the functions of more than one variable as used in ordinary mathematics.
Schönfinkel’s procedure results in curried functions as we know them from
λ-calculus.
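
The contrast between the two outcomes can be sketched in Python. This is an illustration only; the names add, add_curried and add_to_three are assumptions introduced here.

```python
# Uncurried: one function of two variables, as in ordinary mathematics.
def add(x, y):
    return x + y

# Curried: functionalise one variable at a time, so that every function
# takes exactly one argument and may return another function.
add_curried = lambda x: lambda y: x + y

# Feeding only the first argument yields a function again.
add_to_three = add_curried(3)
```

Both forms compute the same sums, but only the curried form can be instantiated with a single argument and still produce a meaningful object.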

² We can now regard a sign that previously was considered replaceable as
replaceable also in those places in which up to this point it was considered
fixed. [footnote by Frege]

In λ-calculus, functionalisation focuses on the function construction.
The abstraction from subexpressions can be omitted, as variables form the
basic terms of the λ-calculus.

The notion of functionalisation in the works of Frege and Russell is
inaccurate according to modern standards. There are many unexpected choices
for the expression that is replaced by a variable in the first step of
function construction. Examples will be given in Remark 2.10. In modern
systems, which usually use λ-calculus, the second step of function
construction is very clear: from a term f we can construct a term λx.f. But
the first step can also be made clear. Take again the expression 2 + 3. This
term is β-equivalent to the λ-term (λx.x + 3)2, which can be regarded as the
term λx.x + 3 (a function) applied to an argument, viz. 2. More precisely,
(λx.x + 3)2 is a β-expansion of 2 + 3. In this β-expansion, both the newly
introduced variable, x, and the object that is replaced by x, namely 2, are
present. They are linked via the λ-abstractor. The term λx.x + 3 is a
function which is applied to the argument 2. By removing the argument 2
from (λx.x + 3)2, we obtain the function λx.x + 3. More generally, we can
construct a function from a λ-term f by first taking a β-expansion (λx.f′)a
of f, and then removing the argument a. This is a much more precise
description of functionalisation than the one that is given in the work of
Frege.

This does not mean that the work on functionalisation has finished 
now. There are several variants of functionalisation that have not yet been 
studied completely. See the section on special forms of functionalisation 
below. 


Instantiation 


Instantiation is the inverse process of functionalisation. It consists of
applying a function to an argument, and calculating the result of this
application. As the function is an algorithm, it is prescribed how this
calculation should be made. For example, if we instantiate the function
λx.x + 3 with the argument 2, we first apply this function to 2, obtaining
(λx.x + 3)2, and then calculate the result via β-reduction: 2 + 3. Just like
the functionalisation process, the instantiation process has two phases:


1. Application construction. Juxtaposing the function f to an argument
a; the result is usually written as f(a) or fa and denotes an intended
function application;


2. Concretisation to a subexpression. Calculating the result of this
intended application. f is a function, and therefore has been
constructed from an expression f′ with a free variable (say x). The
calculation usually consists of a substitution of a for the free variable
x.


These phases are clearly visible in the λ-calculus example above. The
calculation consists of removing the λ-abstraction from λx.x + 3, and
substituting the argument 2 for the free variable that appears when the
λ-abstraction is removed. Moreover, it becomes clear that function
construction is the inverse procedure of application construction, and that
abstraction from a subexpression is the inverse procedure of concretisation
to a subexpression. See Figure 1.


                      functionalisation
         abstraction from              function
         a subexpression             construction
        ----------------->         ----------------->
 2 + 3                  (λx.x + 3)2                  λx.x + 3
        <-----------------         <-----------------
         concretisation to            application
         a subexpression             construction
                        instantiation


Figure 1: Functionalisation and instantiation are each other’s inverse

It is not always that simple. Sometimes, the function f to which we
apply an argument a is not a concrete object, but only a variable. For
example, look at the function λy.zy that is applied to an argument 2. In
that case, the instantiation cannot be carried out completely. We can apply
the function to the argument, obtaining (λy.zy)2, and this term β-reduces
to z2. As we do not have a concrete object as function, but only the variable
z, we cannot proceed with this calculation. If we substitute a function for
z (say: λx.x + 3), we obtain (λx.x + 3)2. Then we continue the calculation
by substituting 2 for x in x + 3, obtaining 2 + 3. 
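
The two phases of instantiation, and the case in which the calculation gets stuck on a variable, can be sketched as follows. This Python sketch rests on assumptions made for the illustration: terms are nested tuples, the function names are introduced here, and substitution is naive (it assumes that no variable capture can occur, which holds for the examples shown).

```python
# Terms as nested tuples (a representation assumed for this sketch only):
#   ("num", 2)  ("var", "x")  ("add", l, r)  ("lam", "x", body)  ("app", f, a)

def substitute(term, x, a):
    """Return term[x := a]. Naive: assumes no variable capture can occur."""
    tag = term[0]
    if tag == "var":
        return a if term[1] == x else term
    if tag in ("add", "app"):
        return (tag, substitute(term[1], x, a), substitute(term[2], x, a))
    if tag == "lam":
        if term[1] == x:                      # x is bound again here: stop
            return term
        return ("lam", term[1], substitute(term[2], x, a))
    return term                               # numbers

def make_application(f, a):
    """Phase 1, application construction: juxtapose f and a."""
    return ("app", f, a)

def concretise(term):
    """Phase 2: carry out β-reductions outside λ-abstractions; an application
    whose function part is not a λ-term is left as it is."""
    tag = term[0]
    if tag == "app":
        f, a = concretise(term[1]), concretise(term[2])
        if f[0] == "lam":
            return concretise(substitute(f[2], f[1], a))
        return ("app", f, a)                  # stuck: f is not a concrete function
    if tag == "add":
        return ("add", concretise(term[1]), concretise(term[2]))
    return term

add3 = ("lam", "x", ("add", ("var", "x"), ("num", 3)))     # λx.x + 3
done = concretise(make_application(add3, ("num", 2)))      # 2 + 3

zy = ("lam", "y", ("app", ("var", "z"), ("var", "y")))     # λy.zy
stuck = concretise(make_application(zy, ("num", 2)))       # z2
resumed = concretise(substitute(stuck, "z", add3))         # 2 + 3
```

The value stuck is the term z2, on which no further calculation is possible; once a concrete function is substituted for z, the calculation continues to 2 + 3.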

As with functionalisation, instantiation can now be precisely defined
in terms of λ-calculus. In the works of Frege and Russell, we do not find
such a precise description of instantiation. The application construction is
well-described (for instance in the “vice versa” part of *9·15 in Principia
Mathematica, quoted above), but there is no precise definition of the
concretisation to subexpressions by means of substitution. This matters
little as long as it is straightforward how the substitution should be
carried out. However, we will see that substitution is not a
straightforward procedure in Principia Mathematica.

The precise definition of substitution in λ-calculus is due to Curry and
Feys [38] (1958). However, we must remark that “precise” is a relative
notion here. The presentations of functionalisation and instantiation that
were given by Frege and Russell were very precise for those days. And the
definition of substitution by Curry and Feys in 1958 is not the last word
to be said on the notion of substitution. Currently, there is quite some
research on so-called explicit substitutions, which are refinements of the
notion of substitution of Curry and Feys. See [15], [1], [71], [8].


Special forms of functionalisation 


The mechanisms of functionalisation and instantiation that are used in
λ-calculus are quite powerful, but have some disadvantages:


• In λ-calculus, the first step in the functionalisation process is not
carried out. In particular, the functionalisation process in λ-calculus
does not show which term (object) has been abstracted from. This is
in contrast to the systems of Frege and Russell, where it is clear from
the functionalisation process which object has been abstracted from.


However, there are also modern functionalisation processes in which
it is essential to remember the original term that has been abstracted
from. A good example is the use of definitions. If a subexpression
k occurs in an expression f, it is sometimes practical to introduce an
abbreviation for k, for several reasons:


- The syntactical representation of the object k may be long. This
makes manipulations with f a time-consuming and memory-consuming
task, in particular when k occurs several times in
f. Abbreviating k can make manipulations with f easier;


- The object k may represent a structure that is particularly
interesting. Abbreviating k opens the possibility to introduce a
significant name for k. This makes the expression easier to
understand for human beings.


Abbreviating k can be seen as a functionalisation process: we replace
all the occurrences of k by its definiendum (its name), and then have
an unfolding algorithm that can be used to replace the definiendum by
its definiens (that is: k) when the internal structure of k is needed in
the term f. For example, if we develop the theory of natural numbers
using the axioms of Peano, we have terms 0 (zero), S0 (the successor
of zero, or: one), S(S0) (two), S(S(S0)), etc. In this notation, the
expression 2 + 3 looks like S(S0) + S(S(S0)). Introducing abbreviations
(1 for S0, 2 for S(S0), etc.) makes the term shorter and clearer.
The definiens of 2 can be stored in some context, but it can also be
stored directly in front of the term 2 + 3 like a λ-abstraction. We
then obtain a new term


2 = S(S0) IN 2 + 3.


Storing the definition in a context is usually done for definitions that
are used at several places, in several expressions; storing the definition
in front of a term usually takes place when the definition is important
for that term only;


• It is not always necessary to perform “full functionalisation”. For
instance, have a look at the axiom of induction on the natural numbers.
This axiom can be written as a function:


λP. P0 → ∀n[Pn → P(Sn)] → ∀n[Pn].


This function takes one argument: a predicate P on natural numbers.
A mathematician usually is not interested in the axiom presented
in the above formulation. Often he is interested in instantiations of
the axiom only. Therefore, a mathematician may prefer the induction
axiom in the form of an axiom scheme, depending on a parameter for
the predicate P. The scheme itself is not part of the formal language,
but all the instantiations of the scheme are. As the scheme itself
is not part of the language, this “parametric” presentation of the
induction axiom is not as strong as the presentation with the λ-term:
the latter is part of the formal language, so it is possible to discuss
the axiom within the formal language. Nevertheless, the parametric
presentation occurs very often in practice (for instance, in applications
and implementations of mathematics), and therefore deserves a closer
study (see Chapter 6 of this thesis).
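
The definition mechanism described in the first item above can be sketched in Python. The sketch is an illustration under assumptions: Peano terms are represented as nested tuples, and the names numeral, unfold, let_in and unfold_let are introduced here. It shows both ways of storing a definition, in a context and directly in front of a term.

```python
# Peano terms as nested tuples (a representation assumed for this sketch only):
#   ("zero",)   ("succ", t)   ("add", l, r)   ("name", "2")

def numeral(n):
    """Build the term S(S(...S0...)) with n successors."""
    term = ("zero",)
    for _ in range(n):
        term = ("succ", term)
    return term

def unfold(term, definitions):
    """Replace every definiendum (name) in term by its definiens."""
    tag = term[0]
    if tag == "name":
        return unfold(definitions[term[1]], definitions)
    if tag == "succ":
        return ("succ", unfold(term[1], definitions))
    if tag == "add":
        return ("add", unfold(term[1], definitions), unfold(term[2], definitions))
    return term                                   # zero

def let_in(name, definiens, body):
    """The term 'name = definiens IN body': the definition sits in front."""
    return ("let", name, definiens, body)

def unfold_let(term):
    """Unfold a definition that is stored in front of a term."""
    _, name, definiens, body = term
    return unfold(body, {name: definiens})

# Definitions kept in a context, used by the short term 2 + 3.
context = {"2": numeral(2), "3": numeral(3)}
short = ("add", ("name", "2"), ("name", "3"))
long_form = unfold(short, context)                # S(S0) + S(S(S0))

# The definition of 2 stored in front of the term: 2 = S(S0) IN 2 + 3.
local = let_in("2", numeral(2), ("add", ("name", "2"), numeral(3)))
```

In both cases the abbreviated term keeps track of what has been abstracted from, and unfolding restores the long form only when the internal structure is needed.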


Preliminaries 


We assume that the reader is more or less familiar with the basics of typed
λ-calculus and type theory. We give a survey of the most important topics
concerning typed λ-calculus in Appendix A. General introductions to type
theory are also available in the literature (e.g. [93], [99]).

Pure Type Systems play an important role in the thesis, especially in
Chapters 4-6. At the point where PTSs enter the work, in Section 4b,
we give a short explanation of the various PTS-rules. Again, we refer to
Appendix A for a short summary of the theory regarding PTSs.


Overview of this thesis 


The thesis is divided into six chapters. 

In the first chapter we discuss the prehistory of type theory. That is, we 
study the way in which types implicitly occurred in logic and mathematics 
before there was an explicit theory of types. We pay special attention to 
the formalisation of logic that is made in Frege’s Begriffsschrift [45] and 
Grundgesetze der Arithmetik [48, 52], as in this system many basic ideas 
are presented that are later used in type theory. Moreover, the system of 
Grundgesetze der Arithmetik is the one for which Russell derives his famous 
paradox, and this paradox has been the reason for Russell to introduce the 
first theory of types. 

This first type theory is the subject of the second chapter. Whitehead
and Russell present their theory, the Ramified Type Theory (RTT), in an
informal way. Several rough descriptions of this theory have been given
in the literature (see for instance [101], [64], [30] and [32]) but we present
a formalisation of RTT that is directly based on the presentation of RTT
in Whitehead and Russell’s Principia Mathematica ([121], 1910-12). The
construction of this formalisation is not a simple task. Whitehead and
Russell do not present a clear syntax for their so-called propositional
functions in [121], nor do they make a clear distinction between syntax and
semantics. We present a formal definition of propositional function that is
faithful to the original ideas set out in Principia Mathematica. A second
technical problem is the notion of substitution, which is left entirely
undefined in Principia Mathematica. The formalisation of the notion of
propositional function makes it possible to express the notion of
substitution of Principia Mathematica in terms of λ-calculus. We use
techniques from typed and untyped λ-calculus to give a precise description
of substitution, and to show that substitution is well-defined as long as we
restrict ourselves to well-typed propositional functions of RTT.

In 1926, Ramsey [101] proposes an important simplification of RTT, 
the simple theory of types. This simple type theory has become the basis 
for many modern type systems, and for the simply typed λ-calculus of
Church [30]. The simplification consists of the removal of one of the two
hierarchies from RTT. The hierarchy of types is maintained, while the
hierarchy of orders is removed. In Chapter 3 we discuss this process of
so-called deramification. An important observation of this chapter is that
though the orders do not occur in the mainstream of type theories, they 
still provide an important intuition for logicians. We show that there is 
a close link between the hierarchy of orders in RTT and the hierarchy of 
truths that was introduced by Kripke [78]. We also show that Kripke’s use 
of orders is more flexible than Russell’s, and that this is due to the fact that 
orders occur at the semantical level in Kripke’s theory, while they occur at 
the level of syntax in RTT. 

Though type theory clearly served as a method to prevent certain logical
paradoxes, the logical system stood apart from the type system until the
1950s. In Chapter 4 we study the ways in which logic can be included in a
type system. The various methods are all based on the idea that the proof
of a logical implication can be seen as a function. More precisely: a proof of
the proposition A → B is implemented as a function that takes a proof of A
as argument, and returns a proof of B. In this way, the proposition A → B
can be seen as the type of all functions from (proofs of) A to (proofs of) B.
Similarly, a proof of A → B becomes a term of type A → B. One calls this
principle: propositions as types, or: proofs as terms. Both expressions are
abbreviated by PAT. We illustrate the principle by giving a description of
RTT in a PAT style.
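
The PAT principle can be illustrated in a proof assistant that is built on it. The following is a small sketch in Lean 4, a modern system chosen here for illustration and not discussed in the thesis; the theorem names modusPonens and implTrans are assumptions introduced for the example.

```lean
-- A proof of A → B is a function from proofs of A to proofs of B,
-- so modus ponens is nothing but function application.
theorem modusPonens (A B : Prop) (f : A → B) (a : A) : B :=
  f a

-- Transitivity of implication is function composition:
-- the proof term is a λ-abstraction of type A → C.
theorem implTrans (A B C : Prop) (f : A → B) (g : B → C) : A → C :=
  fun a => g (f a)
```

In both cases the proposition is the type, and the term after := is the proof that the type checker verifies.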

One of the important applications of PAT is the mechanical verifica- 
tion of mathematical proofs. The first tool for such a verification was 
AUTOMATH. It was developed in 1968. The languages of the various AU- 
TOMATH systems have been studied intensively. In [5], a description of two 
of the most important systems within the framework of Pure Type Systems 
is given, but without any explanation. In Chapter 5 we study the original 
language of AUTOMATH and translate it to a PTS format. In doing so, we 
obtain descriptions similar to the ones in [5]. 

The description in Chapter 5 is precise, but does not take into account
two important mechanisms of AUTOMATH: the definition mechanism and
the parameter mechanism. Many other type systems use these mechanisms
as well. This motivates us to extend the framework of PTSs with
definitions and parameters in Chapter 6. As far as definitions are
concerned, this extension is based on the PTSs with definitions that were
introduced by Severi and Poll [114]. This extension results in a refinement
of the framework of PTSs. In this refined framework, various modern type
systems (like LF and ML) can be described in a more precise way than in the
PTS framework without definitions and parameters.


Chapter 1 


Prehistory 


In this chapter, we discuss the development of type theory before it was 
actually baptised. This may sound like a contradiction. But types have 
played an important (though not very apparent) role in mathematics even 
before the theory of types was explicitly introduced by Russell in 1908 
[108]. Moreover, knowledge of the development of logic and mathematics 
before 1908, and especially of the occurrence of the logical paradoxes at the 
turn of the century, provides insight into the way in which Russell and others
formulated their theories of types. 

When the first formalisations of parts of mathematics and logic ap- 
peared, the types were left implicit. Cantor’s Set Theory [25, 26], Peano’s 
formalisation of the theory of natural numbers in [97], and Frege’s Be- 
griffsschrift [45] and Grundgesetze der Arithmetik [48, 52] did not have a 
formal type system. The type of an object was indicated by means of natural
language (“Let a be a proposition”) or was taken for granted. Types were
informally present in the background of these theories, but a formal repre- 
sentation of the types was not incorporated: one could say that they were 
separated from logic and mathematics. 

However, even without a formalisation of the notion of types, the
introduction of formal language had considerable advantages in the
description of mathematical notions. The formalisation made it easier to
give a precise definition of important abstract concepts, like the concept
of function. The precise formulation allowed for a generalisation of the
notion of function to include not only functions that take numbers as an
argument, and return a number, but also functions that can take and return
other sorts of arguments (like propositions, but also functions).
Unfortunately, this also allowed logical paradoxes to enter the formal
theory, without the (informal) type mechanism being able to prevent that.

In this chapter we first argue that types have always been present in 
mathematics, though probably nobody was aware of it before the end of 
the 19th century (Section 1a). We proceed by describing how the logical
paradoxes entered the formal systems of Frege, Cantor and Peano in Section 
1b. 

The historical remarks in this chapter have been taken from various
sources. The most important ones are [14], [37], [61], [76], [99], [104] and
[122].


1a Paradox threats


The most fundamental idea behind type theory is being able to distinguish 
between different classes of objects (types). Until the end of the 19th cen- 
tury it had hardly ever been necessary to make this ability explicit. The 
mathematical language itself was predominantly informal, and so was the 
use of classes of objects. 

It is, however, difficult to argue that there were no types before Rus- 
sell “invented” them in 1908. Already around 325 B.C., Euclid began his 
Elements [43] with the following primitive definitions: 


1. A point is that which has no part; 
2. A line is breadthless length. 


From these two basic notions of “point” and “line”, Euclid defined more 
complex notions, like the notion of “circle”: 


15. A circle is a plane figure contained by one line such that all the 
straight lines falling upon it from one point among those lying within 
the figure are equal to one another. 


At first sight, these three observations are mere definitions. But these 
three pieces of text do not only define the notions of point, line and circle, 
they also show that Euclid distinguished between points, lines and circles.
Throughout the Elements, Euclid always mentioned to which class an object
belonged (the class of points, the class of lines, etc.). In doing so, he
prevented undesired situations, like the intersection of two points (instead
of two lines).

Undesired results? Euclid himself would probably have said: impossible 
results. When talking of an intersection, intuition implicitly forced him 
to think about the type of the objects of which he wanted to construct the 
intersection. As the intersection of two points is not supported by intuition, 
he did not even try to undertake such a construction. 

Euclid’s attitude to, and implicit use of, type theory was maintained by
the mathematicians and logicians of the next twenty-one centuries. From 
the 19th century on, mathematical systems became less intuitive, for several 
reasons: 


1. The system itself is complex, or abstract. An example is the theory 
of convergence in real analysis; 


2. The system is a formal system, for example, the formalisation of logic 
in Frege’s Begriffsschrift; 


3. (In the second half of the 20th century:) It is not a human being work- 
ing with the system, but something with less intuition, in particular: 
a computer. 


We will call these three situations paradox threats. In all these cases, there
is not enough intuition to activate the (implicitly present) type theory to 
warn against an impossible situation. One proceeds to reason within the 
impossible situation and then obtains a result that may be wrong or para- 
doxical: an undesired situation. We mention examples related to the three 
situations above: 


ad 1. The controversial results on convergence of series in analysis obtained 
in the 17th and 18th century, due to lack of knowledge on what real 
numbers actually are; 


ad 2. The logical paradoxes that arose from self-application of functions. 
Self-application is intuitively impossible, but this is easily forgotten 
when working in a formal system in which such self-application can 
be expressed. The result is undesirable: a logical paradox;


ad 3. An untyped computer program may receive instructions from a not
too watchful user to add the number 3 to the string four (instead of
the number 4). The computer, unaware of the fact that four is not
a number, starts its calculation. It is not programmed to handle the
calculation of 3+four. The result of this calculation is unpredictable.
The computer may


• give an answer that is clearly wrong (for example, **);
• give no answer at all;
• give an answer that is not so clearly wrong (for example, 6).

Especially the last situation is highly undesirable.
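
The situation of ad 3 can be made concrete with a small sketch. The Python fragment below is only an illustration added for this purpose: the names checked_add and unchecked_add are hypothetical, and the coercion rule (anything that is not a number is silently read as its length) is an assumption chosen to mimic a machine that proceeds without any type information.

```python
def checked_add(a, b):
    """Add two values only after checking that both are integers."""
    for value in (a, b):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError(f"cannot add {value!r}: not a number")
    return a + b


def unchecked_add(a, b):
    """Mimic an untyped machine: coerce blindly and keep calculating."""
    def as_number(value):
        # Hypothetical coercion rule (an assumption of this sketch):
        # anything that is not an int is silently replaced by its length.
        return value if isinstance(value, int) else len(value)
    return as_number(a) + as_number(b)


# The untyped variant returns a number without any warning.
print(unchecked_add(3, "four"))

# The typed variant refuses to start the calculation at all.
try:
    checked_add(3, "four")
except TypeError as error:
    print("rejected:", error)
```

Under the assumed coercion rule the unchecked variant answers with a number that looks perfectly plausible, which is the third and most dangerous outcome listed above; the checked variant plays the role of the type mechanism and stops the impossible calculation before it begins.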


The example ad 2 is the main subject of the next section. 


1b Paradox threats in formal systems 


In the 19th century, the need for a more precise style in mathematics arose. 
Controversial results had appeared in analysis. Many of these controversies 
were solved by the work of Cauchy. For instance, he introduced a precise 
definition of convergence in his Cours d’Analyse [27]. Due to the more exact 
definition of real numbers given by Dedekind [41], the rules for reasoning 
with real numbers became even more precise. 

In 1879, Frege published his Begriffsschrift [45], in which he presented 
the first formalisation of logic. Frege’s reasoning was uncommonly precise 
for those days. Until then, it had been possible to make mathematical and 
logical concepts more clear by textual refinement in the natural language 
in which they were described. Frege was not satisfied with this: 


“... I found the inadequacy of language to be an obstacle; no 
matter how unwieldy the expressions I was ready to accept, I 
was less and less able, as the relations became more and more 
complex, to attain the precision that my purpose required.” 


(Begriffsschrift, Preface)

Frege therefore presented a completely formal system, whose


“first purpose is to provide us with the most reliable test of the 
validity of a chain of inferences and to point out every presup- 
position that tries to sneak in unnoticed, so that its origin can 
be investigated.” 


(Begriffsschrift, Preface) 


1b1 Functions and their course of values 


The introduction of a very general definition of function was the key to 
the formalisation of logic. Frege defined what we will call the Abstraction 
Principle: 


Abstraction Principle 1.1 


“If in an expression, [...] a simple or a compound sign has
one or more occurrences and if we regard that sign as replaceable 
in all or some of these occurrences by something else (but ev- 
erywhere by the same thing), then we call the part that remains 
invariant in the expression a function, and the replaceable part 
the argument of the function.” 


(Begriffsschrift, Section 9) 


Frege put no restrictions on what could play the role of an argument. An 
argument could be a number (as was the situation in analysis), but also a 
proposition, or a function. Similarly, the result of applying a function to 
an argument did not necessarily have to be a number. Functions of more 
than one argument were constructed by a method that is very close to the 
method presented by Schönfinkel [109] in 1924: 


Abstraction Principle 1.2 


“If, given a function, we think of a sign¹ that was hitherto regarded
as not replaceable as being replaceable at some or all of its
occurrences, then by adopting this conception we obtain a
function that has a new argument in addition to those it had
before.”


(Begriffsschrift, Section 9)


¹ We can now regard a sign that previously was considered replaceable as replaceable
also in those places in which up to this point it was considered fixed. [footnote by Frege]
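
The two Abstraction Principles can be sketched in a few lines of Python. This is a modern illustration under our own assumptions, not Frege's formulation: an expression is modelled as a tuple of signs, the helper name abstract is hypothetical, and the sketch always replaces every occurrence of a sign, whereas Principle 1.1 also allows replacing only some occurrences. A function of several arguments is obtained one argument at a time, in the style attributed above to Schönfinkel.

```python
def abstract(expression, *signs):
    """Regard the given signs of `expression` as replaceable.

    Returns a curried function that takes one argument per replaceable
    sign and finally yields the expression with all of them filled in.
    """
    def step(remaining, bindings):
        if not remaining:
            # Every replaceable sign has received its argument; the
            # replacement is done simultaneously on the original signs.
            return tuple(bindings.get(token, token) for token in expression)
        head, *tail = remaining
        return lambda argument: step(tail, {**bindings, head: argument})
    return step(list(signs), {})


expression = ("3", "<", "5")

# Principle 1.1: the sign "3" is replaceable, the rest stays invariant.
less_than_five = abstract(expression, "3")
print(less_than_five("4"))

# Principle 1.2: regarding "5" as replaceable too adds a new argument.
less_than = abstract(expression, "3", "5")
print(less_than("5")("3"))
```

Nothing in this sketch restricts what may be supplied as an argument: a number, a proposition or another function are all accepted, which is exactly the generality that Frege's definition allows.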


With this definition of function, two of the three possible paradox threats
mentioned in Section 1a occurred:


1. The generalisation of the concept of function made the system more 
abstract and less intuitive. The fact that functions could have differ- 
ent types of arguments is at the basis of the Russell Paradox; 


2. Frege introduced a formal system instead of the informal systems that
were used up till then. Type theory, which would have been helpful in
distinguishing between the different types of arguments that a function
might take, was left informal.


So, Frege had to proceed with caution. And so he did, at this stage. He 
remarked that 


“if the [... ] letter [sign] occurs as a function sign, this circum- 
stance [should] be taken into account.” 


(Begriffsschrift, Section 11) 


This can be read as showing that Frege was aware of some typing rule that
forbids substituting functions for object variables or objects for
function variables. In his paper Function and Concept [47], Frege stated
this more explicitly:


“Now just as functions are fundamentally different from objects,
so also functions whose arguments are and must be functions
are fundamentally different from functions whose arguments
are objects and cannot be anything else. I call the latter
first-level, the former second-level.”


(Function and Concept, pp. 26-27)

A few pages later he continued:


“In regard to second-level functions with one argument, we must
make a distinction, according as the role of this argument can
be played by a function of one or of two arguments.”


(Function and Concept, p. 29) 


Therefore, we may safely conclude that Frege avoided the two paradox
threats in the Begriffsschrift. In Function and Concept we even see that
he was aware of the fact that distinguishing between first-level and
second-level functions is essential in preventing certain paradoxes:


“The ontological proof of God’s existence suffers from the fallacy 
of treating existence as a first-level concept.” 


(Function and Concept, p. 27, footnote) 


The Begriffsschrift, however, was only a prelude to Frege’s writings. In 
Grundlagen der Arithmetik [46] he argued that mathematics can be seen 
as a branch of logic. In Grundgesetze der Arithmetik [48, 52] he actually
described the elementary parts of arithmetic within an extension of the
logical framework that was presented in the Begriffsschrift.

Frege approached the paradox threats for a second time at the end
of Section 2 of his Grundgesetze. There he defined the expression “the
function $\Phi(x)$ has the same course-of-values as the function $\Psi(x)$” by


“the functions $\Phi(x)$ and $\Psi(x)$ always have the same value for
the same argument.”


(Grundgesetze, p. 7) 


Note that functions $\Phi(x)$ and $\Psi(x)$ may have equal courses-of-values even
if they have different definitions. For instance, let $\Phi(x)$ be $x \wedge \neg x$, and $\Psi(x)$
be $x \leftrightarrow \neg x$, for all propositions $x$. Then $\Phi(x) = \Psi(x)$ for all $x$. So $\Phi(x)$
and $\Psi(x)$ are different functions, but have the same course-of-values.

Frege denoted the course-of-values of a function $\Phi(x)$ by $\grave{\varepsilon}\Phi(\varepsilon)$.² The
definition of equal courses-of-values could therefore be expressed as


$$\grave{\varepsilon}f(\varepsilon) = \grave{\varepsilon}g(\varepsilon) \leftrightarrow \forall a[f(a) = g(a)]. \tag{1}$$


In modern terminology, we could say that the functions $\Phi(x)$ and $\Psi(x)$
have the same course-of-values if they have the same graph.
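
The example of the two propositional functions with equal courses-of-values can be checked mechanically. The sketch below is our own illustration: it assumes that propositions are modelled by the two Python truth values, and the helper name course_of_values is hypothetical. It computes the graph of a function over that two-element domain.

```python
def phi(x):
    """x and not x."""
    return x and not x


def psi(x):
    """x if and only if not x."""
    return x == (not x)


def course_of_values(function, domain=(False, True)):
    """The graph of `function`: all (argument, value) pairs over the domain."""
    return frozenset((argument, function(argument)) for argument in domain)


# Different definitions ...
print(phi is psi)
# ... but the same graph, hence the same course-of-values.
print(course_of_values(phi) == course_of_values(psi))
```

Both functions return false for every argument, so their graphs coincide although their definitions differ, in line with definition (1).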

Frege did not provide a satisfying intuition for the formal notion of 
course-of-values of a function. He treated courses-of-values as ordinary 
objects. As a consequence, a function that takes objects as arguments could 
have its own course-of-values as an argument. In modern terminology: a 
function that takes objects as arguments can have its own graph as an 
argument. All essential information of a function is contained in its graph. 
So intuitively, a system in which a function can be applied to its own graph 
should have similar possibilities as a system in which a function can be 
applied to itself. Frege excluded the paradox threats from his system by 
forbidding self-application, but due to his treatment of courses-of-values 
these threats were able to enter his system through a back door. 


1b2 The Russell Paradox in the Grundgesetze 


In 1902, Russell wrote a letter to Frege [106], in which he informed Frege
that he had discovered a paradox in Frege’s Begriffsschrift. Russell gave his
well-known argument, defining the propositional function $f(x)$ by $\neg x(x)$ (in
Russell’s words: “to be a predicate that cannot be predicated of itself”). He
assumed $f(f)$. Then by definition of $f$, $\neg f(f)$, a contradiction. Therefore
$\neg f(f)$ holds. But then (again by definition of $f$), $f(f)$ holds. Russell
concluded that both $f(f)$ and $\neg f(f)$ hold, a contradiction.
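
An untyped programming language lets us write down the same self-application. The following Python sketch is an analogy only, not Russell's derivation: evaluation of a function call stands in for determining a truth value, and the predicate always_true is a hypothetical example of a harmless argument.

```python
def f(x):
    """Russell's f: f(x) holds exactly when x(x) does not hold."""
    return not x(x)


def always_true(_):
    """An ordinary predicate that ignores its argument."""
    return True


# Applied to a harmless predicate, f simply returns a truth value.
print(f(always_true))

# Applied to itself, f(f) = not f(f) = not not f(f) = ... never settles.
try:
    f(f)
except RecursionError:
    print("f(f) has no value: the evaluation does not terminate")
```

Nothing in the language prevents the expression f(f) from being written; only its evaluation reveals that no truth value can be assigned to it. A typing rule that forbids applying a function to itself rejects the expression before any evaluation takes place.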


² This may be the origin of Russell’s notation $\hat{x}\Phi(x)$ for the class of objects that have
the property $\Phi$. According to a paper by J. B. Rosser [105], the notation $\hat{x}\Phi(x)$ has
been at the basis of the current notation $\lambda x.\Phi$ in λ-calculus. Church is supposed to have
written $\wedge x\Phi(x)$ for the function $x \mapsto \Phi(x)$, writing the hat in front of the $x$ in order
to distinguish this function from the class $\hat{x}\Phi(x)$. For typographical reasons, the $\wedge$ is
supposed to have changed into a $\lambda$. On the other hand, J. P. Seldin informed us [111]
that he had asked Church about it in 1982, and that Church had answered that there
was no particular reason for choosing $\lambda$, that some letter was needed and $\lambda$ happened
to have been chosen. Moreover, Curry had told him that Church had a manuscript in
which there were many occurrences of $\lambda$ already in 1929, so three years before the paper
[28] appeared.

Only six days later, Frege answered Russell that Russell's derivation
of the paradox was incorrect [51]. He explained that the self-application 
f(f) is not possible in the Begriffsschrift. f(x) is a function, which re- 
quires an object as an argument, and a function cannot be an object in the 
Begriffsschrift (see 1b1). 

In the same letter, however, Frege explained that Russell’s argument 
could be amended to a paradox in the system of his Grundgesetze, using 
the course-of-values of functions. Frege explained his amendment only briefly
in that letter, but he added an appendix of eleven pages to the second
volume of his Grundgesetze in which he provided a very detailed and correct
description of the paradox.

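The derivation goes as follows (using the same argument as Frege,
though replacing Frege’s two-dimensional notation by the nowadays more
usual one-dimensional notation). First, define the function $f(x)$ by:

$$\neg\forall\varphi[(\grave{\varepsilon}\varphi(\varepsilon) = x) \to \varphi(x)].$$

Write $K = \grave{\varepsilon}f(\varepsilon)$. By (1) we have, for any function $g(x)$,

$$\grave{\varepsilon}g(\varepsilon) = \grave{\varepsilon}f(\varepsilon) \to g(K) = f(K)$$

and this implies

$$f(K) \to ((\grave{\varepsilon}g(\varepsilon) = K) \to g(K)). \tag{2}$$

As this holds for any function $g(x)$, we have

$$f(K) \to \forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)]. \tag{3}$$

On the other hand, for any function $g$,

$$\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)] \to ((\grave{\varepsilon}g(\varepsilon) = K) \to g(K)).$$

Substituting $f(x)$ for $g(x)$ results in:

$$\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)] \to ((\grave{\varepsilon}f(\varepsilon) = K) \to f(K))$$

and as $\grave{\varepsilon}f(\varepsilon) = K$ by definition of $K$,

$$\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)] \to f(K).$$

Using the definition of $f$, we obtain

$$\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)] \to \neg\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)],$$

hence by reductio ad absurdum,

$$\neg\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)],$$

or shorthand:

$$f(K). \tag{4}$$

Applying (3) results in

$$\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)],$$

which implies

$$\neg\neg\forall\varphi[\grave{\varepsilon}\varphi(\varepsilon) = K \to \varphi(K)],$$

or shorthand:

$$\neg f(K). \tag{5}$$

(4) and (5) contradict each other.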


1b3 How wrong was Frege? 


In the history of the Russell Paradox, Frege is often depicted as the pitiful 
person whose system was inconsistent. This suggests that Frege’s system 
was the only one that was inconsistent, and that Frege was very inaccurate 
in his writings. On these points, history does Frege an injustice. 

In fact, Frege’s system was much more accurate than other systems of 
those days. Peano’s work, for instance, was less precise on several points: 


• Peano hardly paid any attention to logic, especially not to quantification
theory;


• Peano did not make a strict distinction between his symbolism and
the objects underlying this symbolism. Frege was much more accurate
on this point (see also his paper Über Sinn und Bedeutung [49]);


• Frege made a strict distinction between a proposition (as an object
of interest or discussion) and the assertion of a proposition. Frege
denoted a proposition, in general, by ─A, and the assertion of the
proposition by ⊢A. The symbol ⊢ is still widely used in logic and
type theory. Peano did not make this distinction and simply wrote
A.


Nevertheless, Peano’s work was very popular, for several reasons: 


• Peano had able collaborators, and in general had a better eye for presentation
and publicity. For instance, he bought his own press, so that
he could supervise the printing of his journal Rivista di Matematica
and Formulaire [98];


• Peano used a symbolism much closer to the notations that
were used in those days by mathematicians (and many of his notations,
like ∈ for “is an element of”, and ⊃ for logical implication, are
also used in Russell’s Principia Mathematica, and are actually still in
use).


Frege’s work did not have these advantages and was hardly read before
1902.³ In the last paragraph of [50], Frege concluded:


“... I observe merely that the Peano notation is unquestionably
more convenient for the typesetter, and in many cases takes up
less room than mine, but that these advantages seem to me,
due to the inferior perspicuity and logical defectiveness, to have
been paid for too dearly, at any rate for the purposes I want
to pursue.”


(Ueber die Begriffsschrift des Herrn Peano und meine eigene,
p. 378)


³ When Peano published his formalisation of mathematics in 1889 [97] he clearly did
not know Frege’s Begriffsschrift, as he did not mention the work, and was not aware of
Frege’s formalisation of quantification theory. Peano considered quantification theory to
be “abstruse” in [98], to which Frege proudly reacted:


“In this respect my conceptual notation of 1879 is superior to the Peano one.
Already, at that time, I specified all the laws necessary for my designation
of generality, so that nothing fundamental remains to be examined. These
laws are few in number, and I do not know why they should be said to be
abstruse. If it is otherwise with the Peano conceptual notation, then this
is due to the unsuitable notation.”


([50], p. 376)


Frege’s system was not the only paradoxical one. The Russell Paradox
can be derived in Peano’s system as well, by defining the class

$$K \stackrel{\mathrm{def}}{=} \{x \mid x \notin x\}$$

and deriving $K \in K \leftrightarrow K \notin K$. In Cantor’s Set Theory one can derive
the paradox via the same class (or set, in Cantor’s terminology).


1b4 The importance of Russell’s Paradox 


Russell’s paradox was certainly not the first or only paradox in history. 
Paradoxes were already widely known in antiquity. The first known para- 
dox is the Achilles paradox of Zeno of Elea. It is a purely mathematical 
paradox. Due to a precise formulation of mathematics and especially the 
concept of real numbers, the paradox can now be satisfactorily solved. 

The oldest logical paradox is probably the Liar’s Paradox, also known
as the Paradox of Epimenides. It can be formulated very briefly by the
sentence “This sentence is not true”. The paradox was widely known in
antiquity. For instance, it is referred to in the Bible (Titus 1:12). It is
based on the confusion between language and meta-language.

The Burali-Forti paradox ([24], 1897) is the first of the modern para- 
doxes. It is a paradox within Cantor’s theory on ordinal numbers. Cantor’s 
paradox on the largest cardinal number occurs in the same field. It must 
have been discovered by Cantor around 1895, but was not published before 
1932. 

The logicians considered these paradoxes to be out of the scope of logic:
the paradoxes based on the Liar’s Paradox could be regarded as a problem
of linguistics, and the paradoxes of Cantor and Burali-Forti occurred
in what was in those days a highly questionable part of mathematics:
Cantor’s Set Theory.

The Russell Paradox, however, was a paradox that could be formulated 
in all the systems that were presented at the end of the 19th century (except
for Frege’s Begriffsschrift). It touched the very foundations of logic. It could not
be disregarded, and a solution to it had to be found.


Chapter 2 


Type theory in Principia 
Mathematica 


When Russell proved Frege’s Grundgesetze to be inconsistent, Frege was 
not the only person in trouble. In Russell’s letter to Frege (1902), we read: 


“I am on the point of finishing a book on the principles of
mathematics”


(Letter to Frege, [106]) 


Therefore, Russell had to find a solution to the paradoxes, before he could 
finish his book. 

His paper Mathematical logic as based on the theory of types [108] 
(1908), in which a first step is made towards the Ramified Theory of Types, 
started with a description of the most important contradictions that were 
known up till then, including Russell’s own paradox. He then concluded: 


“In all the above contradictions there is a common character- 
istic, which we may describe as self-reference or reflexiveness. 
[...] In each contradiction something is said about all cases of 
some kind, and from what is said a new case seems to be gen- 
erated, which both is and is not of the same kind as the cases 
of which all were concerned in what was said.” 


(Ibid.)

Russell’s plan was, therefore, to avoid the paradoxes by avoiding all
possible self-references. He postulated the “vicious circle principle”:


Vicious Circle Principle 2.1 


“Whatever involves all of a collection must not be one of the
collection.”


([81], p. 28) 


Russell applied this principle very strictly. He implemented it using types,
in particular the so-called ramified types. The theory presented in Mathematical
logic as based on the theory of types was elaborated in Chapter II of
the Introduction to the famous Principia Mathematica [121] (1910-1913).
In the Principia, Whitehead and Russell founded mathematics on logic, 
as far as possible. The result was a very formal and accurate build-up of 
mathematics, avoiding the logical paradoxes. 

The logical part of the Principia was based on the works of Frege. This 
was acknowledged by Whitehead and Russell in the preface, and can also 
be seen throughout the description of Type Theory. The notion of function 
is based on Frege’s Abstraction Principles 1.1 and 1.2, and the Principia
notation $\hat{x}f(x)$ for a class looks very similar to Frege’s $\grave{\varepsilon}f(\varepsilon)$ for
course-of-values.

An important difference is that Whitehead and Russell treated functions 
as first-class citizens. Frege used courses-of-values as a way of speaking 
about functions (and was confronted with a paradox); in the Principia a 
direct approach was possible. Equality, for instance, was defined for objects
as well as for functions by means of Leibniz equality ($x = y$ if and only if
$f(x) \leftrightarrow f(y)$ for all propositional functions $f$; see [121], ∗13·11).

The description of the Ramified Theory of Types (RTT) in the Principia 
was, though extensive, still informal. It is clear that Type Theory had not 
yet become an independent subject. The theory 


“only recommended itself to us in the first instance by its ability 
to solve certain contradictions” 


(Principia Mathematica, p. 37)

And though


“it has also a certain consonance with common sense which
makes it inherently credible”


(Principia Mathematica, p. 37)


(probably, Whitehead and Russell refer to the implicit, intuitive use of
types by mathematicians; see Section 1a), Type Theory was not introduced
because it was interesting on its own, but because it had to serve as a tool 
for logic and mathematics. A formalisation of Type Theory, therefore, was 
not considered in those days. 

Though the description of RTT in the Principia was still informal, it 
was clearly present throughout the work. It was not mentioned very often, 
but when necessary, Russell made a remark on RTT. This is an important 
difference with the earlier writings of Frege, Peano and Cantor. 

If we want to compare RTT with contemporary type systems, we have to 
make a formalisation of RTT. Though there are many descriptions of RTT 
available in the literature (like [30], [32], [64], [101] and Section 27 of [110]), 
none of these descriptions presents a formalisation that is both accurate 
and as close as possible to the ideas of the Principia. We fill this
gap in the literature in the first part of this chapter.

Making such a formalisation is by no means easy: 


• Important formal notions, especially the notion of substitution, remained
completely unexplained in the Principia;


• The accuracy of Frege’s work was not present in Russell’s. This was
already observed by Gödel, who said that the precision of Frege was
lost in the writings of Russell, and who, due to the informality of
some basic notions of the Principia, had to give his paper [57] the
title Über formal unentscheidbare Sätze der Principia Mathematica
und verwandter Systeme.


In 1b1 we saw that Frege generalised the notion of function from analysis.
For Russell’s formalisation of mathematics within logic, a special kind
of these functions was needed: the so-called propositional functions. A
propositional function (pf) always returns a proposition when it is applied
to suitable arguments. In Section 2a, we introduce a formalised version of
these pfs. This makes it possible to compare pfs with other formal systems,
like λ-calculus, and to give a precise definition of substitution.

In Section 2b we give a formalisation of Russell’s notion of ramified
type (2b1), followed by a formal definition of the notion “the pf f is of type
t” (2b2). We motivate this definition (2b3) by referring to passages in the
Principia. As the formalisation of pf is precise enough to be translated
to λ-calculus, we can make a comparison between RTT and current type
systems.

Thanks to our formal notation and its relation with λ-calculus, we are
able to prove properties of RTT in an easy way, using properties of modern 
type systems. This will be done in Section 2c. Due to the new notation it 
is relatively easy to see that we have proved variants of well-known theo- 
rems from Type Theory, like Strong Normalisation, Free Variable Lemma, 
Strengthening Lemma, Unicity of Types and Subterm Lemma. 

In Section 2d we answer in full detail the question which pfs are typable. 
We also make a comparison between our notion of typable pf, and the 
corresponding notion in the Principia, and conclude that these two notions 
of typable pf coincide. 

Parts of this chapter have been taken from earlier publications: [79],
[80] and [82].

In this section we present a formalisation of the propositional functions (pfs)
of the Principia. In Section 2a1 we give a syntax that is as close as possible
to the ideas of the Principia. Intuition about this syntax is provided in
Section 2a2 by translating pfs into λ-terms. In Section 2a3 we define some
related notions that are needed in the rest of the Chapter. We devote a
special section 2a4 to the notion of substitution. This notion is clearly
present in the Principia, but not formally defined. Due to the translation
to λ-calculus of Section 2a2, we are able to give a precise definition.


2a1 Definition


The definition of propositional function in the Principia is as follows:


“By a “propositional function” we mean something which contains
a variable x, and expresses a proposition as soon as a value
is assigned to x.”


(Principia Mathematica, p. 38) 


Pfs are, however, constructed from propositions with the use of the Abstrac- 
tion Principles: they arise when in a proposition one or more occurrences 
of a sign are replaced by a variable. Therefore we have to begin our formal- 
isation with certain basic propositions, certain basic signs, and signs that 
indicate a replaceable object. For this purpose we use 


• A set $\mathcal{A}$ of individual symbols (the basic signs);
• A set $\mathcal{V}$ of variables (the signs that indicate replaceable objects);
• A set $\mathcal{R}$ of relation symbols together with a map $a : \mathcal{R} \to \mathbb{N}^+$ indicating
the arity of each relation symbol (these are used to form the
basic propositions).


We want to have a sufficient supply of individual symbols, variables and
relation symbols and therefore assume that $\mathcal{A}$ and $\mathcal{V}$ are infinite (but countable),
and that $\{R \in \mathcal{R} \mid a(R) = n\}$ is infinite (but countable) for each
$n \in \mathbb{N}^+$. We assume that $\{a_1, a_2, \ldots\} \subseteq \mathcal{A}$, $\{x, y, z, x_1, \ldots\} \subseteq \mathcal{V}$ and
$\{R, S, \ldots\} \subseteq \mathcal{R}$. We use $a_1, a_2, \ldots$ as metavariables over $\mathcal{A}$; $x, y, z, x_1, \ldots$
as metavariables over $\mathcal{V}$ and $R, S, \ldots$ as metavariables over $\mathcal{R}$. For technical
reasons we assume that there is an order (e.g. alphabetical) on $\mathcal{V}$. We
write $x < y$ if $x$ is ordered before $y$, and not equal to $y$ (so: $<$ is strict). In
particular, we assume that

$$x < x_1 < \cdots < y < y_1 < \cdots < z < z_1 < \cdots$$

and: for each $x$ there is a $y$ with $x < y$.

Definition 2.2 (Atomic propositions) A list of symbols of the form
$R(a_1, \ldots, a_{a(R)})$ is called an atomic proposition.


Other names used for these atomic propositions in the Principia are ele- 
mentary judgements and elementary propositions (cf. [121], pp. xv, 43-45, 
and 91). 

Propositional functions in Principia Mathematica are generated from
atomic propositions by two means:

• The use of logical connectives and quantifiers;


• Abstraction from (earlier generated) propositional functions, using
the abstraction principles.


This leads to the following formal definition of propositional function. 
Examples are given in 2.5 and intuition is provided in Section 2a2. 


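Definition 2.3 (Propositional functions) We define a collection $\mathcal{P}$ of
propositional functions (pfs), and for each element $f$ of $\mathcal{P}$ we simultaneously
define the collection $\mathrm{FV}(f)$ of free variables of $f$:


1. If $i_1, \ldots, i_{a(R)} \in \mathcal{A} \cup \mathcal{V}$ then $R(i_1, \ldots, i_{a(R)}) \in \mathcal{P}$.
$\mathrm{FV}(R(i_1, \ldots, i_{a(R)})) \stackrel{\mathrm{def}}{=} \{i_1, \ldots, i_{a(R)}\} \cap \mathcal{V}$;


2. If $f, g \in \mathcal{P}$ then $f \vee g \in \mathcal{P}$ and $\neg f \in \mathcal{P}$.
$\mathrm{FV}(f \vee g) \stackrel{\mathrm{def}}{=} \mathrm{FV}(f) \cup \mathrm{FV}(g)$; $\mathrm{FV}(\neg f) \stackrel{\mathrm{def}}{=} \mathrm{FV}(f)$;


3. If $f \in \mathcal{P}$ and $x \in \mathrm{FV}(f)$ then $\forall x[f] \in \mathcal{P}$.
$\mathrm{FV}(\forall x[f]) \stackrel{\mathrm{def}}{=} \mathrm{FV}(f) \setminus \{x\}$;


4. If $n \in \mathbb{N}$ and $k_1, \ldots, k_n \in \mathcal{A} \cup \mathcal{V} \cup \mathcal{P}$, then $z(k_1, \ldots, k_n) \in \mathcal{P}$.
$\mathrm{FV}(z(k_1, \ldots, k_n)) \stackrel{\mathrm{def}}{=} \{z, k_1, \ldots, k_n\} \cap \mathcal{V}$.
If $n = 0$ then we write $z()$ in order to distinguish the pf $z()$ from the
variable $z$¹;


5. All pfs can be constructed by using the construction-rules 1, 2, 3 and
4 above.


We use the letters $f, g, h$ as meta-variables over $\mathcal{P}$.


Definition 2.4 (Propositions) A propositional function $f$ is a proposition
if $\mathrm{FV}(f) = \emptyset$.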
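
The clauses of Definitions 2.3 and 2.4 can be read as a recursive data type with a recursive function for the free variables. The Python sketch below is one possible rendering, given under our own assumptions: the class names Ind, Var, Rel, Or, Not, Forall and App and the functions fv and is_proposition are hypothetical names, arities of relation symbols are not checked, and implication is written with the two primitive connectives only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ind:
    """An individual symbol (an element of A)."""
    name: str


@dataclass(frozen=True)
class Var:
    """A variable (an element of V)."""
    name: str


@dataclass(frozen=True)
class Rel:
    """Rule 1: R(i1, ..., in), every argument an individual or a variable."""
    symbol: str
    args: Tuple[object, ...]


@dataclass(frozen=True)
class Or:
    """Rule 2: f or g."""
    left: object
    right: object


@dataclass(frozen=True)
class Not:
    """Rule 2: not f."""
    body: object


@dataclass(frozen=True)
class Forall:
    """Rule 3: forall x [f], where x must be free in f."""
    var: Var
    body: object


@dataclass(frozen=True)
class App:
    """Rule 4: z(k1, ..., kn); the arguments may be pfs, and n may be 0."""
    head: Var
    args: Tuple[object, ...] = ()


def fv(f):
    """Free variables of a pf, clause by clause as in Definition 2.3."""
    if isinstance(f, Rel):
        return frozenset(i for i in f.args if isinstance(i, Var))
    if isinstance(f, Or):
        return fv(f.left) | fv(f.right)
    if isinstance(f, Not):
        return fv(f.body)
    if isinstance(f, Forall):
        if f.var not in fv(f.body):
            raise ValueError("rule 3 requires the bound variable to be free in the body")
        return fv(f.body) - {f.var}
    if isinstance(f, App):
        # Only the head and arguments that are themselves variables count;
        # variables inside an argument pf are not free in z(k1, ..., kn).
        return frozenset({f.head}) | frozenset(k for k in f.args if isinstance(k, Var))
    raise TypeError(f"not a propositional function: {f!r}")


def is_proposition(f):
    """Definition 2.4: a pf without free variables."""
    return not fv(f)


x, y, z = Var("x"), Var("y"), Var("z")
# forall z [not z(x) or z(y)]: z(x) implies z(y), written with not/or only.
implication = Forall(z, Or(Not(App(z, (x,))), App(z, (y,))))
# forall z [z() or not z()]: the law of the excluded middle.
excluded_middle = Forall(z, Or(App(z), Not(App(z))))
print(sorted(v.name for v in fv(implication)))
print(is_proposition(excluded_middle))
```

The first pf has the free variables x and y and is therefore not a proposition; the second has no free variables and is a proposition in the sense of Definition 2.4.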


Example 2.5 We give some examples of (higher-order) pfs of the form
$z(k_1, \ldots, k_n)$ in ordinary mathematics. To keep the link with mathematics
clear, we use some extra logical connectives like $\to$ and $\leftrightarrow$.


1. The pfs $z(x)$ and $z(y)$ in the definition of equality according to Leibniz:
By definition $x = y$ if and only if

$$\forall z[z(x) \leftrightarrow z(y)];$$


2. The pfs $z(0)$, $z(x)$ and $z(y)$ in the formulation of the principle of
mathematical induction:

$$\forall z[z(0) \to (\forall x \forall y[z(x) \to (S(x,y) \to z(y))]) \to \forall x[z(x)]]$$

(we suppose that the relation symbol $S$ represents the successor function:
$S(x,y)$ holds if and only if $y$ is the successor of $x$);


3. $z()$ in the formulation of the law of the excluded middle:

$$\forall z[z() \vee \neg z()].$$


¹ It is important to note that a variable is not a pf. See for instance [107], Chapter
VII: “The variable”, p. 94 of the 7th impression.


2a2 Propositional functions as λ-terms


The binding structure and the notion of free variable of pfs become clearer
if we translate pfs to λ-terms. Moreover, such a translation will be useful
at several places in this Chapter, for instance when we give a definition
of substitution.

We first translate one of the examples of Example 2.5. Then we give 
a formal definition of the translation that we have in mind. After that we 
provide additional remarks and intuition on pfs. 


Example 2.6 Consider the pf $f = \forall z[z(x) \leftrightarrow z(y)]$ of Example 2.5.1. Two
objects x and y are Leibniz-equal if and only if they share the same properties.
These objects are represented by the variables x and y. The variable
z is a variable for properties of objects, in other words: predicates over objects.
Such a predicate is a function that takes the object as argument, and
returns a truth value. The expression z(x) indicates that the predicate that
is taken for z must be applied to the object that is taken for x. Therefore,
we translate z(x) by an application of z to x in λ-calculus: $zx$. Similarly
we translate the expression z(y) by $zy$.

Just as in [30], we can interpret logical connectives as functions. Therefore
we can translate $z(x) \leftrightarrow z(y)$ by the λ-term $\leftrightarrow(zx)(zy)$. We handle the
translation of universal quantification also as in [30], hence $\forall z[\ldots]$ translates
to $\forall(\lambda z.\ldots)$. As an effect we get a λ-term $\forall(\lambda z.\leftrightarrow(zx)(zy))$ with
two free variables, x and y. But we want to have a function taking two
arguments. This can be solved by a double λ-abstraction. The final result
is $\lambda x.\lambda y.\forall(\lambda z.(\leftrightarrow(zx)(zy)))$.

We remark that the pf f has two free variables, x and y. These two free
variables correspond to the two arguments that the propositional function
takes, and therefore to the two λ-abstractions that are at the front of the
translation of f.


In the following definition, we translate the propositional functions to
$\lambda$-terms in a similar way as we did in Example 2.6. Let
$f \in \mathcal{P}$ and let $x_1 < \cdots < x_m$ be the free variables of $f$.
We define a $\lambda$-term $\overline{f}$. We do this in such a way that
$\overline{f} \equiv \lambda x_1.\cdots\lambda x_m.F$, where $F$ is a
$\lambda$-term that is not of the form $\lambda x.F'$. To keep notations
uniform, we also give translations $\overline{a}$ for $a \in \mathcal{A}$ and
$\overline{x}$ for $x \in \mathcal{V}$. To keep notations short, we use
$\lambda_{i=1}^{m} x_i.F$ as shorthand for $\lambda x_1.\cdots\lambda x_m.F$.


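Definition 2.7
• $\overline{a} \overset{\mathrm{def}}{=} a$ for $a \in \mathcal{A}$;

• $\overline{x} \overset{\mathrm{def}}{=} x$ for $x \in \mathcal{V}$;

• Now assume $f \in \mathcal{P}$ has free variables $x_1 < \cdots < x_m$. Use
  induction on the structure of $f$:

  - $f \equiv R(i_1,\ldots,i_{a(R)})$. Then
    $\overline{f} \overset{\mathrm{def}}{=} \lambda_{i=1}^{m} x_i.R\,i_1 \cdots i_{a(R)}$;

  - $f \equiv f_1 \lor f_2$. We can assume that for $j = 1,2$,
    $\overline{f_j} \equiv \lambda_{i=1}^{m_j} y_i^j.F_j$, where
    $y_1^j < \cdots < y_{m_j}^j$ are the free variables of $f_j$. Then
    $\overline{f} \overset{\mathrm{def}}{=} \lambda_{i=1}^{m} x_i.{\lor}F_1F_2$.

    If $f \equiv \neg f'$ then we can assume that
    $\overline{f'} \equiv \lambda_{i=1}^{m} x_i.F'$, because
    $x_1 < \cdots < x_m$ are the free variables of $f'$. Let
    $\overline{f} \overset{\mathrm{def}}{=} \lambda_{i=1}^{m} x_i.\neg F'$;

  - $f \equiv z(k_1,\ldots,k_n)$. Let
    $\overline{f} \overset{\mathrm{def}}{=} \lambda_{i=1}^{m} x_i.z\,\overline{k_1} \cdots \overline{k_n}$;

  - $f \equiv \forall x[f']$. We can assume that
    $\overline{f'} \equiv \lambda_{i=1}^{j} x_i.\lambda x.\lambda_{i=j+1}^{m} x_i.F'$,
    because $x_1,\ldots,x_m,x$ are the free variables of $f'$. Define
    $\overline{f} \overset{\mathrm{def}}{=} \lambda_{i=1}^{m} x_i.\forall(\lambda x.F')$.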


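Example 2.8

$f$                   | $\overline{f}$
$R(x)$                | $\lambda x.Rx$
$z(R(x),S(a))$        | $\lambda z.z(\lambda x.Rx)(Sa)$
$z_1(a) \lor z_2()$   | $\lambda z_1.\lambda z_2.{\lor}(z_1 a)z_2$
$z(y(R(x)))$          | $\lambda z.z(\lambda y.y(\lambda x.Rx))$
$\forall x[R(x)]$     | $\forall(\lambda x.Rx)$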
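
As an illustration that is not part of the original text, the translation of Definition 2.7 can be sketched in Python. The encoding is an assumption made for this sketch only: pfs are nested tuples tagged `"rel"`, `"or"`, `"not"`, `"all"` and `"var"`, a name counts as a variable when it starts with x, y or z, and Python's string order stands in for the alphabetical order on variables. The helper names `fv`, `tilde`, `bar` and `show` are hypothetical.

```python
def is_var(name):
    # Assumed convention: variables start with x, y or z; other names are individuals.
    return name[0] in "xyz"

def fv(f):
    """Free variables of a pf, following the four construction rules."""
    tag = f[0]
    if tag == "rel":                      # ("rel", "R", (i1, ..., in))
        return {i for i in f[2] if is_var(i)}
    if tag == "or":                       # ("or", f1, f2)
        return fv(f[1]) | fv(f[2])
    if tag == "not":                      # ("not", f')
        return fv(f[1])
    if tag == "all":                      # ("all", x, f')
        return fv(f[2]) - {f[1]}
    if tag == "var":                      # ("var", z, (k1, ..., kn))
        return {f[1]} | {k for k in f[2] if isinstance(k, str) and is_var(k)}
    raise ValueError(tag)

def app_all(head, args):
    """Left-nested application: head a1 a2 ... an."""
    for a in args:
        head = ("app", head, a)
    return head

def tilde(f):
    """Body of the translation, without the leading abstractions."""
    tag = f[0]
    if tag == "rel":
        return app_all(f[1], f[2])
    if tag == "or":
        return app_all("∨", [tilde(f[1]), tilde(f[2])])
    if tag == "not":
        return app_all("¬", [tilde(f[1])])
    if tag == "all":
        return app_all("∀", [("lam", f[1], tilde(f[2]))])
    if tag == "var":                      # parameters that are pfs are translated in closed form
        return app_all(f[1], [k if isinstance(k, str) else bar(k) for k in f[2]])
    raise ValueError(tag)

def bar(f):
    """Closed translation: abstract over the free variables in alphabetical order."""
    term = tilde(f)
    for x in sorted(fv(f), reverse=True):
        term = ("lam", x, term)
    return term

def show(t):
    """Print a term in the juxtaposition style used in the text."""
    if isinstance(t, str):
        return t
    if t[0] == "lam":
        return "λ" + t[1] + "." + show(t[2])
    fun, arg = t[1], t[2]
    left = "(" + show(fun) + ")" if not isinstance(fun, str) and fun[0] == "lam" else show(fun)
    right = show(arg) if isinstance(arg, str) else "(" + show(arg) + ")"
    return left + right

Rx = ("rel", "R", ("x",))
Sa = ("rel", "S", ("a",))
examples = [
    Rx,
    ("var", "z", (Rx, Sa)),
    ("or", ("var", "z1", ("a",)), ("var", "z2", ())),
    ("var", "z", (("var", "y", (Rx,)),)),
    ("all", "x", Rx),
]
for f in examples:
    print(show(bar(f)))
```

The five examples are the pfs of Example 2.8. Note that plain string comparison orders `x10` before `x2`, so a more careful order would be needed for variables with large indices.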


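By induction on the structure of $f$ one can prove the following properties
of $\overline{f}$:


Lemma 2.9 (Properties of $\overline{f}$) Let $f \in \mathcal{P}$.
1. $\mathrm{FV}(\overline{f}) = \emptyset$;
2. $\overline{f}$ is in $\beta$-normal form;
3. $\overline{f}$ is a $\lambda I$-term;
4. If $x_1 < \cdots < x_m$ are the free variables of $f$, then
   $\overline{f} \equiv \lambda_{i=1}^{m} x_i.F$, where $F$ is not of the form
   $\lambda x.F'$.
$\boxtimes$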


Observe that we use FV for indicating both the free variables of a pf 
and the free variables of a A-term. We take care that it will always be clear 
in which meaning we use FV. 

We make some remarks on the definition of propositional function 2.3. 


Remark 2.10 We show that the propositional functions of Definition 2.3 
are indeed objects that exist in the theory of Russell. 


1. In Rule 1 we describe the atomic propositions, and the atomic propo- 
sitions in which one or more individuals have been replaced by vari- 
ables due to one or more applications of the abstraction principles. 
The abstraction principles are not only present in the works of Frege, 
but also in the Principia (cf. for instance *9·14 and *9·15);


2. Rule 2 describes the use of the logical connectives $\lor$ and $\neg$.
These logical connectives are also used in the Principia.
Implication$^{2}$, conjunction$^{3}$ and logical equivalence$^{4}$ are
defined in terms of negation and disjunction. In examples, we sometimes use
symbols for implication, conjunction and logical equivalence as
abbreviations;


3. Rule 3 describes the use of the universal quantifier. It is explicitly
stated in the Principia (cf. pp. 14-16) that the pf $\forall x[f]$ can only
be constructed if $f$ is a pf that contains $x$ as a variable. Existential
quantification$^{5}$ is defined in terms of negation and universal
quantification;


4. Rule 4 is also an instantiation of the abstraction principle. The pfs
that can be constructed by using the construction-rules 1-3 only are
exactly the pfs of what nowadays would be called first-order predicate
logic. With rule 4, higher-order pfs can be constructed. This is based on
the following idea. Let $f$ be a (fixed) pf in which $k_1,\ldots,k_n$ occur.
We can interpret $f$ as an instantiation of a function that has taken
arguments $k_1,\ldots,k_n$. We now generalise this to $z(k_1,\ldots,k_n)$,
representing any function $z$ taking these arguments. Such a construction is
also explicitly present in the Principia:


“the first matrices$^{6}$ that occur are those whose values are of the
forms $\varphi x, \psi(x,y), \chi(x,y,z,\ldots)$, i.e. where the arguments,
however many there may be, are all individuals. Such [propositional]
functions we will call ‘first-order functions.’ We may now introduce a
notation to express ‘any first-order function.’ ”


(Principia Mathematica, p. 51)


$^{2}$ cf. Principia, *1·01, p. 94
$^{3}$ cf. Principia, *3·01, p. 107
$^{4}$ cf. Principia, *4·01, p. 117
$^{5}$ cf. Principia, *10·01, p. 140
$^{6}$ See Remark 2.13 [footnote of the author].


Remark 2.11 The definition of free variable needs some special attention.
We must notice that, for instance,


$\mathrm{FV}(z(R(x), S(a))) = \{z\}$


and not $\{x,z\}$. The reason for this is that the notion of free variable
should harmonise with the intuitive notion of “argument place” of Frege
and Russell. As was indicated in Remark 2.10.4, $z$ represents an arbitrary
function that takes $R(x)$ and $S(a)$ as arguments and returns a proposition.
This means that we do not have to supply an argument for $x$ “by hand”.
As soon as we feed a suitable$^{7}$ argument $f$ to $z$ in $z(R(x), S(a))$,
$f$ will take the arguments $R(x)$ and $S(a)$, and return a proposition.

This idea is also clearly reflected in the translation of $z(R(x),S(a))$ to
the $\lambda$-term $\lambda z.z(\lambda x.Rx)(Sa)$. The variable $x$ is bound
in a subterm $\lambda x.Rx$ that is an argument to the variable $z$. The full
$\lambda$-term is a function of $z$ only.


$^{7}$ At this stage, we cannot provide a formalisation of “suitable”. This
can only be done after we have introduced types, and formalised the notion
“the pf $f$ is of type $t$”.


Remark 2.12 In the Introduction we suggested that functionalisation can
be represented in $\lambda$-calculus by first making a $\beta$-expansion, and
then removing the argument. The translation of Definition 2.7 enables us to
show that functionalisation in the theories of Russell and Frege (using the
Abstraction Principles) is similar to functionalisation in
$\lambda$-calculus. We do this by giving some examples. Consider the pf
$R(a) \lor S(a)$. There are several ways to apply the abstraction principles
to this pf:


1. The list of symbols $R(a)$ can be seen as an instance of the pf $z(a)$.
   $z(a)$ is a pf that takes a unary propositional function as an argument
   and returns the value of that pf for the argument $a$. Applying
   abstraction to $R(a)$ in $R(a) \lor S(a)$ results in $z(a) \lor S(a)$. In
   $\lambda$-calculus: expand the $\lambda$-term ${\lor}(Ra)(Sa)$ to
   ${\lor}((\lambda x.Rx)a)(Sa)$, and then to
   $(\lambda z.{\lor}(za)(Sa))(\lambda x.Rx)$. Remove the argument
   $\lambda x.Rx$ and we have the translation of $z(a) \lor S(a)$;


2. But one could also consider the expression $R(a) \lor S(a)$ as an
   instance of the pf that takes two propositions and returns their
   disjunction: the pf $z_1() \lor z_2()$. In $\lambda$-calculus:
   ${\lor}(Ra)(Sa)$ $\beta$-expands to $(\lambda z_2.{\lor}(Ra)z_2)(Sa)$.
   Removing the argument gives $\lambda z_2.{\lor}(Ra)z_2$. A similar
   operation on $Ra$ results in $\lambda z_1.\lambda z_2.{\lor}z_1z_2$;


3. More abstract: One could consider $R(a) \lor S(a)$ as an instance of the
   pf $z(R(a),S(a))$ for the argument $z_1() \lor z_2()$. The pf
   $z(R(a),S(a))$ takes one argument, say $f$ (for the variable $z$). Such an
   argument $f$, in its turn, needs to be a pf taking two propositions as
   arguments. In $\lambda$-calculus, ${\lor}(Ra)(Sa)$ expands to
   $(\lambda z.z(Ra)(Sa)){\lor}$. Removing the argument gives
   $\lambda z.z(Ra)(Sa)$.

   Applying $z(R(a), S(a))$ to $f$ results in the pf $f$, evaluated for the
   arguments $R(a)$ and $S(a)$;


4. And even more mind-bogglingly (for persevering readers only):
   $R(a) \lor S(a)$ is an instance of the pf $z(R(x), S(a))$ for the argument
   $z_1(a) \lor z_2()$. The pf $z(R(x), S(a))$ takes one argument (for $z$).
   Such an argument $f$, in its turn, must be a pf that takes one pf and one
   proposition as arguments. The pf-argument of $f$ must take one individual
   as an argument. If we evaluate $z(R(x),S(a))$ for the argument $f$, then
   we get as a result the value of $f$ for the arguments $R(x)$ and $S(a)$.

   In $\lambda$-calculus: ${\lor}(Ra)(Sa)$ $\beta$-expands to
   ${\lor}((\lambda x.Rx)a)(Sa)$, then to
   $(\lambda z_1.\lambda z_2.{\lor}(z_1 a)z_2)(\lambda x.Rx)(Sa)$,
   and finally to
   $(\lambda z.z(\lambda x.Rx)(Sa))(\lambda z_1.\lambda z_2.{\lor}(z_1 a)z_2)$.

   Removing the argument gives $\lambda z.z(\lambda x.Rx)(Sa)$.

   Let us check what we have done by evaluating $z(R(x),S(a))$ for the
   argument $z_1(a) \lor z_2()$. Above, we argued that $R(a) \lor S(a)$ is an
   instance of $z(R(x),S(a))$ for the argument $z_1(a) \lor z_2()$, so as a
   result we must obtain $R(a) \lor S(a)$.

   According to the description of $z(R(x),S(a))$ above, assigning the value
   $z_1(a) \lor z_2()$ to $z$ is equal to the value of $z_1(a) \lor z_2()$ for
   the arguments $R(x)$ and $S(a)$. Substituting $R(x)$ for $z_1$ gives the
   value of $R(x)$ for the argument $a$: $R(a)$. Substituting $S(a)$ for $z_2$
   gives the proposition $S(a)$. The final result is indeed $R(a) \lor S(a)$.
   This calculation can also be carried out in $\lambda$-calculus by applying
   $\lambda z.z(\lambda x.Rx)(Sa)$ to
   $\lambda z_1.\lambda z_2.{\lor}(z_1 a)z_2$ and reducing to $\beta$-normal
   form.


All these abstractions are in line with 1.1 and 1.2 on page 18. Via the
abstraction in the last two examples one also obtains pfs of the form
$z(k_1,\ldots,k_n)$ where some of the $k_i$ are elements of $\mathcal{P}$.
There is no formal definition of abstraction in the works of Frege and
Russell. We could use a definition that is related to $\beta$-expansion. See
Remark 2.29.
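
The four abstractions of Remark 2.12 can be checked mechanically. The following sketch is an illustration added here, not a construction from the Principia: it models the symbols R, S and the disjunction as Python functions that build strings, and lets Python's own function application play the role of $\beta$-reduction. The names `OR`, `abstractions` and `instantiate` are hypothetical. Each abstraction, applied to the argument that was removed, should give back the proposition R(a) ∨ S(a).

```python
# Object-language symbols modelled as string builders (an assumed encoding).
R = lambda t: f"R({t})"
S = lambda t: f"S({t})"
OR = lambda p: lambda q: f"{p} ∨ {q}"       # curried disjunction
a = "a"

target = OR(R(a))(S(a))                      # the proposition R(a) ∨ S(a)

# For each item of the remark: (the abstraction, the removed argument(s)).
abstractions = {
    "1": (lambda z: OR(z(a))(S(a)),           [lambda x: R(x)]),
    "2": (lambda z1: lambda z2: OR(z1)(z2),   [R(a), S(a)]),
    "3": (lambda z: z(R(a))(S(a)),            [OR]),
    "4": (lambda z: z(lambda x: R(x))(S(a)),  [lambda z1: lambda z2: OR(z1(a))(z2)]),
}

def instantiate(fun, args):
    """Feed the removed arguments back, one application at a time."""
    for arg in args:
        fun = fun(arg)
    return fun

results = {item: instantiate(f, args) for item, (f, args) in abstractions.items()}
for item, value in results.items():
    print(item, value, value == target)
```

Python evaluates applications eagerly, which differs from reduction to $\beta$-normal form in general, but for these terminating examples both give the same result.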


Remark 2.13 It appears that there is also an alternative way of
constructing pfs in the Principia. Whitehead and Russell first single out
the quantifier-free pfs (so-called matrices, i.e. the pfs that can be
constructed using construction-rules 1, 2 and 4). Then they form pfs by
defining that


• Any matrix is a pf;


• If $f$ is a pf and $x \in \mathrm{FV}(f)$ then $\forall x[f]$ is a pf with
  free variables $\mathrm{FV}(f) \setminus \{x\}$.


This definition is a little different from our Definition 2.3, as a pf of
the form $z(\forall x[f])$ is not a matrix and therefore not a pf according
to this alternative definition. Nevertheless we feel that Whitehead and
Russell intended to give our Definition 2.3. In the Principia ([121], *54)
they define the natural number 0 as the propositional function
$\forall x[\neg z(x)]$$^{8}$. In defining the principle of induction on
natural numbers, one needs to express the property “0 has the property
$y$”, or: $y(0)$. But $y(0)$ is not a pf according to this alternative
definition, as 0 contains quantifiers.

Therefore we feel that our Definition 2.3, which is also based on the
definition of function by Frege and on the definition of propositional
function on p. 38 of the Principia, is the definition that was meant by
Whitehead and Russell.

$^{8}$ This definition is based on Frege’s definition in Grundlagen der
Arithmetik [46] (1884). See [121], vol. II, p. 4. In [46], the natural
number $n$ is defined as the class of predicates $f$ for which there are
exactly $n$ objects $a$ for which $f(a)$ holds. Hence 0 is the class of
predicates $f$ for which $f(a)$ does not hold for any object $a$. So 0 can be
described by the pf $\forall x[\neg z(x)]$.


Remark 2.14 Note that pfs as such do not yet obey the vicious circle
principle 2.1! For example, $\neg z(z)$ (the pf that is at the basis of the
Russell paradox) is a pf. In Section 2b we will assign types to some pfs,
and it will be shown (Remark 2.66) that no type can be assigned to the pf
$\neg z(z)$.


Remark 2.15 Before we develop the theory based on pfs any further, we must
decide which of the two syntaxes introduced above shall be used in the
sequel. It looks attractive to use the syntax of $\lambda$-calculus:


• This syntax is well-known;


• It is used for many other type systems, so it makes the comparison
  of ramified type theory with modern type systems easier;


• There is a lot of meta-theory on typed and untyped $\lambda$-calculus.
  This can be useful when proving certain properties of the formalisation
  of the ramified theory of types that is to be introduced in the next
  sections;


• The syntax of $\lambda$-calculus gives a clearer view of the notion of
  free variable than the syntax of pfs.


Nevertheless, we shall only indirectly use $\lambda$-calculus for our
further study of the ramified type theory in this Chapter. We have several
reasons for that:


• There are many more $\lambda$-terms than there are pfs. More precisely,
  the mapping $f \mapsto \overline{f}$ is not surjective. As we want to
  study the theory of Principia Mathematica as precisely as possible, we
  only want to study the propositional functions, which are directly
  related to the syntax used by Russell and Whitehead. Not using pf-syntax
  may result in a system in which it is not clear which term belongs to the
  original ramified type theory and which term does not;


• The syntax of $\lambda$-calculus is strongly curried. This would give
  problems in the definition of substitution. In a pf $R(x,y)$ we may want
  to substitute some object $a$ for $y$ without substituting anything for
  $x$. In $\lambda$-calculus, substitution should be translated to
  application followed by $\beta$-reduction to $\beta$-normal form. If we
  want to substitute something for $y$ in the translation
  $\lambda x.\lambda y.Rxy$ of $R(x,y)$, we have to substitute something for
  $x$ first. Choosing a different representation of propositional functions
  does not help: the representation $\lambda y.\lambda x.Ryx$ would have
  given problems if we wanted to substitute something for $x$ without
  substituting something for $y$;


• The translation of pfs to $\lambda$-calculus makes it possible to use the
  meta-theory and the intuition of $\lambda$-calculus when we need it
  without losing control over the original system.


2a3 Related notions 


We continue our discussion of pfs by defining a number of related notions.
If a pf $z(k_1,\ldots,k_n)$ takes an argument $f$ for the variable $z$, the
list $k_1,\ldots,k_n$ indicates what should be substituted for the free
variables of $f$ (cf. also Remark 2.10.4). We therefore call this list the
list of parameters of $z(k_1,\ldots,k_n)$. A formal definition:


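Definition 2.16 (Parameters) Assume $f$ is a pf, and
$k \in \mathcal{A} \cup \mathcal{V} \cup \mathcal{P}$. We define,
inductively, the notion $k$ is a parameter of $f$, and write
$\mathrm{PAR}(f)$ for the set of parameters of $f$.


• $\mathrm{PAR}(R(i_1,\ldots,i_{a(R)})) \overset{\mathrm{def}}{=} \{i_1,\ldots,i_{a(R)}\}$;

• $\mathrm{PAR}(f_1 \lor f_2) \overset{\mathrm{def}}{=} \mathrm{PAR}(f_1) \cup \mathrm{PAR}(f_2)$
  and $\mathrm{PAR}(\neg f) \overset{\mathrm{def}}{=} \mathrm{PAR}(f)$;

• $\mathrm{PAR}(\forall x[f]) \overset{\mathrm{def}}{=} \mathrm{PAR}(f)$;

• $\mathrm{PAR}(z(k_1,\ldots,k_n)) \overset{\mathrm{def}}{=} \{k_1,\ldots,k_n\}$.


Note that $x$ is not a parameter of $z(R(x), S(a))$, but it is a recursive
parameter according to the following definition:


Definition 2.17 (Recursive parameters) Assume $f$ is a pf. We define,
inductively, the set of recursive parameters of $f$, $\mathrm{RP}(f)$:


• $\mathrm{RP}(R(i_1,\ldots,i_{a(R)})) \overset{\mathrm{def}}{=} \{i_1,\ldots,i_{a(R)}\}$;

• $\mathrm{RP}(\neg f) \overset{\mathrm{def}}{=} \mathrm{RP}(f)$;
  $\mathrm{RP}(f_1 \lor f_2) \overset{\mathrm{def}}{=} \mathrm{RP}(f_1) \cup \mathrm{RP}(f_2)$;

• $\mathrm{RP}(\forall x[f]) \overset{\mathrm{def}}{=} \mathrm{RP}(f)$;

• $\mathrm{RP}(z(k_1,\ldots,k_n)) \overset{\mathrm{def}}{=} \{k_1,\ldots,k_n\} \cup \bigcup_{k_i \in \mathcal{P}} \mathrm{RP}(k_i)$.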
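
To make the difference between parameters and recursive parameters concrete, here is a small Python sketch added for illustration. It assumes an encoding of pfs as nested tuples (tags `"rel"`, `"or"`, `"not"`, `"all"`, `"var"`), which is a choice made for this sketch and not notation from the text; the function names `par` and `rp` are likewise hypothetical.

```python
def par(f):
    """Parameters of a pf, clause by clause as in Definition 2.16."""
    tag = f[0]
    if tag == "rel":                     # ("rel", "R", (i1, ..., in))
        return set(f[2])
    if tag == "or":                      # ("or", f1, f2)
        return par(f[1]) | par(f[2])
    if tag in ("not", "all"):            # ("not", f') or ("all", x, f')
        return par(f[-1])
    if tag == "var":                     # ("var", z, (k1, ..., kn))
        return set(f[2])
    raise ValueError(tag)

def rp(f):
    """Recursive parameters of a pf, clause by clause as in Definition 2.17."""
    tag = f[0]
    if tag == "rel":
        return set(f[2])
    if tag == "or":
        return rp(f[1]) | rp(f[2])
    if tag in ("not", "all"):
        return rp(f[-1])
    if tag == "var":
        result = set(f[2])
        for k in f[2]:
            if not isinstance(k, str):   # k is itself a pf: descend into it
                result |= rp(k)
        return result
    raise ValueError(tag)

Rx = ("rel", "R", ("x",))
Sa = ("rel", "S", ("a",))
f = ("var", "z", (Rx, Sa))               # the pf z(R(x), S(a))

print("x" in par(f))                     # x is not a parameter of f
print("x" in rp(f))                      # but it is a recursive parameter
```

Atoms (individuals and variables) are plain strings here, so the test `isinstance(k, str)` separates them from parameters that are pfs themselves.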
Another important notion is the notion of $\alpha$-equality$^{9}$. We want
the pfs $R(x)$ and $R(y)$ to be the same. However, we want the pfs $S(x,y)$
and $S(y,x)$ to be different. The reason for this is the alphabetical order
of the variables $x$, $y$. As $x < y$, we will consider $x$ to be the
“first” variable of the pfs $S(x,y)$ and $S(y,x)$, and $y$ the “second”
variable. The place of the “first” variable in $S(x,y)$, however, is
different from the place of the “first” variable in $S(y,x)$.$^{10}$ We
therefore present the following definition of $\alpha$-equality:


Definition 2.18 ($\alpha$-equality) Let $f$ and $g$ be pfs. We say that $f$
and $g$ are $\alpha$-equal, notation $f =_{\alpha} g$, if there is a
bijection $\varphi : \mathcal{V} \to \mathcal{V}$ such that


• $g$ can be obtained from $f$ by replacing each variable that occurs in
  $f$ by its $\varphi$-image;


• $x < y$ iff $\varphi(x) < \varphi(y)$.


This definition corresponds to the definition of $\alpha$-equality in
$\lambda$-calculus in the following way:


Lemma 2.19 Let $f, g \in \mathcal{P}$. $f =_{\alpha} g$ if and only if
$\overline{f} =_{\alpha} \overline{g}$. $\boxtimes$


Sometimes, we are not that precise, and want the pfs $S(x,y)$ and $S(y,x)$
to be $\alpha$-equal. This can be a consideration especially if we are not
interested in which free variable is “first” and which is “second”. We
call this weakened notion of $\alpha$-equality $\alpha_P$-equality
($\alpha$-equality modulo permutation):


Definition 2.20 ($\alpha$-equality modulo permutation) Let $f$ and $g$ be
pfs. We say that $f$ and $g$ are $\alpha_P$-equal, notation
$f =_{\alpha_P} g$, if there is a bijection
$\varphi : \mathcal{V} \to \mathcal{V}$ such that $g$ can be obtained from
$f$ by replacing each variable that occurs in $f$ by its $\varphi$-image.


$^{9}$ Historically, it is not correct to use this terminology when
discussing the Type Theory of the Principia, which dates from the first
decade of the twentieth century. The term $\alpha$-equality originates from
Curry and Feys’ book Combinatory Logic [38], which appeared only in 1958.
In this book, conversion rules for the $\lambda$-calculus are numbered with
Greek letters $\alpha$, $\beta$. The rule numbered with $\alpha$ is now known
as $\alpha$-conversion; the rule numbered with $\beta$ is now known as
$\beta$-conversion. In earlier papers of Church, Rosser and Kleene, these
rules were numbered with Roman capitals I, II, and the terminology
$\alpha$-conversion, $\beta$-conversion, was not used.

$^{10}$ Compare this with their equivalents in $\lambda$-calculus
$\lambda x.\lambda y.Sxy$ and $\lambda x.\lambda y.Syx$, which are not
$\alpha$-equal, either. We do not want to use the $\lambda$-notation for
determining which variable is “first” and which is “second”, for reasons
explained in Remarks 2.15 and 2.32.


2a4 Substitution


In the Introduction we argued that instantiation is the inverse operator of
function construction. In Remark 2.12 we saw that function construction in
Principia Mathematica can be compared to $\beta$-expansion plus removing an
argument in $\lambda$-calculus. This suggests that instantiation in the
Principia must be comparable to application plus $\beta$-reduction in
$\lambda$-calculus. In [79] we showed that this is indeed the case. There,
we gave a laborious definition of instantiation using the syntax of and the
intuition behind pfs. We showed that this definition is faithful to the
original ideas of the Principia and that it can be imitated in
$\lambda$-calculus using a translation similar to the one in Definition
2.7. This allows us to give a definition of substitution for pfs that is
based on that imitation in $\lambda$-calculus.

As was argued in Remark 2.15, the mapping $f \mapsto \overline{f}$ is not
perfectly suited for a definition of substitution. This was due to the
currying of the $\lambda$-abstractions that are at the front of the term
$\overline{f}$. We therefore take a slightly different notation and remove
these front abstractions from $\overline{f}$:


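Definition 2.21 Let $f \in \mathcal{P}$ with free variables
$x_1 < \cdots < x_m$. Then $\overline{f} \equiv \lambda_{i=1}^{m} x_i.F$ for
some $\lambda$-term $F$. Let $\widetilde{f} \equiv F$.


Example 2.22

$f$                   | $\widetilde{f}$
$R(x)$                | $Rx$
$z(R(x),S(a))$        | $z(\lambda x.Rx)(Sa)$
$z_1(a) \lor z_2()$   | ${\lor}(z_1 a)z_2$
$z(y(R(x)))$          | $z(\lambda y.y(\lambda x.Rx))$
$\forall x[R(x)]$     | $\forall(\lambda x.Rx)$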


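The mapping $f \mapsto \widetilde{f}$ has similar properties as
$f \mapsto \overline{f}$ (cf. Lemma 2.9):

Lemma 2.23 (Properties of $\widetilde{f}$)

1. $\mathrm{FV}(\widetilde{f}) = \mathrm{FV}(f)$;
2. $\widetilde{f}$ is in $\beta$-normal form for all $f$;
3. $\widetilde{f}$ is a $\lambda I$-term for all $f$;
4. $\overline{f}$ is a closure (see A.7) of $\widetilde{f}$;
5. If $\widetilde{f} =_{\alpha} \widetilde{g}$, then $f =_{\alpha} g$.
$\boxtimes$


$f[x_1,\ldots,x_n:=g_1,\ldots,g_n]$   $\equiv$   $h$
        |                                          |
$(\lambda x_1 \cdots x_n.\widetilde{f})\,\overline{g_1} \cdots \overline{g_n}$   $\twoheadrightarrow_{\beta}$   $\widetilde{h}$

Figure 2: Substitution via $\beta$-reduction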


With the $\lambda$-notation we can rely on the notions of $\beta$-reduction
and $\beta$-normal form to give the following definition of substitution:


Definition 2.24 (Substitution) Let $f \in \mathcal{P}$, assume
$x_1,\ldots,x_n$ are distinct variables, and
$g_1,\ldots,g_n \in \mathcal{A} \cup \mathcal{V} \cup \mathcal{P}$. Assume
that the $\lambda$-term


$(\lambda x_1 \cdots x_n.\widetilde{f})\,\overline{g_1} \cdots \overline{g_n}$


has a $\beta$-normal form $H$. Assume $h \in \mathcal{P}$ such that
$\widetilde{h} \equiv H$ (if such an $h$ exists, it is unique due to Lemma
2.23.5). Then


$f[x_1,\ldots,x_n:=g_1,\ldots,g_n] \overset{\mathrm{def}}{=} h$.


We sometimes abbreviate $f[x_1,\ldots,x_n:=g_1,\ldots,g_n]$ to
$f[x_i:=g_i]_{i=1}^{n}$, or $f[\vec{x}:=\vec{g}]$.


So substitution in RTT can be seen as application plus $\beta$-reduction to
$\beta$-normal form in $\lambda$-calculus. Definition 2.24 is schematically
reflected in Figure 2. Notice that $f[x_1,\ldots,x_n:=g_1,\ldots,g_n]$
should be seen as a simultaneous substitution of $g_1,\ldots,g_n$ for
$x_1,\ldots,x_n$. As the $\overline{g_i}$s are either closed
$\lambda$-terms, or individuals, or variables, it is no problem to define
this simultaneous substitution via a list of applications that results in a
list of consecutive substitutions.


Example 2.25


1. $S(x_1)[x_1:=a_1] \equiv S(a_1)$, as
   $(\lambda x_1.Sx_1)a_1 \to_{\beta} Sa_1$;


2. $S(x_1)[x_2:=a_2] \equiv S(x_1)$, as
   $(\lambda x_2.Sx_1)a_2 \to_{\beta} Sx_1$;


3. $z(S(x_1),x_2,a_2)[x_1:=a_1] \equiv z(S(x_1),x_2,a_2)$ as
   $(\lambda x_1.z(\lambda x_1.Sx_1)x_2a_2)a_1 \to_{\beta} z(\lambda x_1.Sx_1)x_2a_2$.
   This illustrates that the $\lambda$-notation is more precise and
   convenient with respect to free variables. In $z(S(x_1),x_2,a_2)$, it is
   not immediately clear whether $x_1$ is a free variable or not and one
   might tend to write
   $z(S(x_1),x_2,a_2)[x_1:=a_1] \equiv z(S(a_1),x_2,a_2)$. The
   $\lambda$-notation is more explicit in showing that
   $x_1 \notin \mathrm{FV}(z(S(x_1),x_2,a_2))$;


4. See Remark 2.12.3.
   $z(R(a),S(a))[z:=z_1() \lor z_2()] \equiv R(a) \lor S(a)$, as
   $(\lambda z.z(Ra)(Sa))(\lambda z_1z_2.{\lor}z_1z_2) \to_{\beta}$
   $(\lambda z_1z_2.{\lor}z_1z_2)(Ra)(Sa) \twoheadrightarrow_{\beta} {\lor}(Ra)(Sa)$;


5. $x_2(x_1,R(x_1))[x_2:=x_4(x_3)] \equiv R(x_1)$ as
   $(\lambda x_2.x_2x_1(\lambda x_1.Rx_1))(\lambda x_3x_4.x_4x_3) \to_{\beta}$
   $(\lambda x_3x_4.x_4x_3)x_1(\lambda x_1.Rx_1) \twoheadrightarrow_{\beta}$
   $(\lambda x_1.Rx_1)x_1 \to_{\beta} Rx_1$.


Remark 2.26 $f[x_1,\ldots,x_n:=g_1,\ldots,g_n]$ is not always defined. For
its existence we need:


• The existence of the normal form $H$ in Definition 2.24. For instance,
  this normal form does not exist if we choose $n = 1$, $f \equiv x_1(x_1)$
  and $g_1 \equiv x_1(x_1)$: then we obtain for the calculation of
  $f[x_1:=g_1]$ the famous $\lambda$-term
  $(\lambda x_1.x_1x_1)(\lambda x_1.x_1x_1)$;


• The existence of a (unique) $h$ such that $\widetilde{h} \equiv H$. For
  instance, if we take $n = 1$, $f \equiv z(a)$ (with $z \in \mathcal{V}$
  and $a \in \mathcal{A}$) and $g_1 \equiv a$, then $H \equiv aa$ and there
  is no $h \in \mathcal{P}$ such that $\widetilde{h} \equiv aa$.


In Section 2c2 we will prove that, as long as we are within the type system
RTT (to be introduced in Section 2b), both $H$ and $h$ always exist uniquely
(Corollary 2.73). Until then, the notation
$f[x_1,\ldots,x_n:=g_1,\ldots,g_n] \equiv h$ implicitly assumes that the
substitution exists.


Remark 2.27 If we compute a substitution
$f[x_1,\ldots,x_n:=g_1,\ldots,g_n]$, we have to reduce the $\lambda$-term
$(\lambda x_1 \cdots x_n.\widetilde{f})\,\overline{g_1} \cdots \overline{g_n}$
to its $\beta$-normal form (if there is any). One might wonder whether this
is too restrictive: In a reduction path to this normal form, there may be an
intermediate result $H$ that could be interpreted as the final result of the
substitution $f[\vec{x}:=\vec{g}]$. However, this never happens, as any term
that can be interpreted as such a result is always of the form
$\widetilde{h}$, and is therefore always in $\beta$-normal form (Lemma
2.23.2).


Remark 2.28 The alphabetical order of the variables plays a crucial role
in the substitution process, as it determines in which order the free
variables of a pf $f$ are curried in the translation $\overline{f}$. For
example, look at the substitutions $z(a,b)[z:=R(x,y)]$ and
$z(a,b)[z:=R(y,x)]$. The result of the first one is obtained via the normal
form of $(\lambda z.zab)(\lambda xy.Rxy)$, which is equal to $Rab$,
translated: $R(a,b)$. The second one is calculated via
$(\lambda z.zab)(\lambda xy.Ryx)$, resulting in $Rba$ and $R(b,a)$.
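
The $\lambda$-calculus side of Definition 2.24 can be tried out with a small $\beta$-normaliser. The Python sketch below is an illustration under assumptions of our own: terms are strings (atoms) or tuples tagged `"lam"` and `"app"`, reduction is leftmost-outermost, and a step bound `fuel` (a hypothetical parameter) replaces a real termination argument. The translations of the pfs are entered by hand, and the step back from the normal form $H$ to a pf $h$ is not implemented.

```python
import itertools

def lam(*names_and_body):
    """lam("x", "y", body) builds the term λx.λy.body."""
    *names, body = names_and_body
    for x in reversed(names):
        body = ("lam", x, body)
    return body

def app(head, *args):
    """app(f, a, b) builds the left-nested application f a b."""
    for arg in args:
        head = ("app", head, arg)
    return head

def free(t):
    if isinstance(t, str):
        return {t}
    if t[0] == "lam":
        return free(t[2]) - {t[1]}
    return free(t[1]) | free(t[2])

def subst(t, x, s):
    """Capture-avoiding substitution of s for x in t."""
    if isinstance(t, str):
        return s if t == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:
        return t
    if y in free(s):                       # rename the bound variable first
        avoid = free(s) | free(body) | {x}
        fresh = next(f"{y}_{i}" for i in itertools.count() if f"{y}_{i}" not in avoid)
        body, y = subst(body, y, fresh), fresh
    return ("lam", y, subst(body, x, s))

def step(t):
    """One leftmost-outermost β-step, or None if t is in β-normal form."""
    if isinstance(t, str):
        return None
    if t[0] == "lam":
        r = step(t[2])
        return None if r is None else ("lam", t[1], r)
    fun, arg = t[1], t[2]
    if not isinstance(fun, str) and fun[0] == "lam":
        return subst(fun[2], fun[1], arg)
    r = step(fun)
    if r is not None:
        return ("app", r, arg)
    r = step(arg)
    return None if r is None else ("app", fun, r)

def normal_form(t, fuel=1000):
    """β-normal form of t, or None if none is found within `fuel` steps."""
    for _ in range(fuel):
        r = step(t)
        if r is None:
            return t
        t = r
    return None

zab = lam("z", app("z", "a", "b"))
print(normal_form(app(zab, lam("x", "y", app("R", "x", "y")))))  # term for Rab
print(normal_form(app(zab, lam("x", "y", app("R", "y", "x")))))  # term for Rba
```

Applied to the self-application term of Remark 2.26, `normal_form` runs out of fuel and returns `None`, which matches the first way in which a substitution can fail to exist. For the second failure mode, `(λz.za)a` normalises to the term `aa`, which is not the translation of any pf.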


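Remark 2.29 Now that substitution has been properly defined, we could
define that $f$ is an abstraction of $g$ if there are
$x_1,\ldots,x_n \in \mathrm{FV}(f)$ and
$h_1,\ldots,h_n \in \mathcal{A} \cup \mathcal{P}$ such that
$f[\vec{x}:=\vec{h}] \equiv g$, or, in $\lambda$-calculus notation:


$(\lambda x_1 \cdots x_n.\widetilde{f})\,\overline{h_1} \cdots \overline{h_n} \twoheadrightarrow_{\beta} \widetilde{g}$.

The set of abstractions of a pf $g$ is therefore comparable with the set of
$\beta$-expansions of the $\lambda$-term $\widetilde{g}$.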


Some elementary calculation with substitutions can be done using the 
following lemma: 


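Lemma 2.30
1. Assume $(f_1 \lor f_2)[\vec{x}:=\vec{h}]$ exists. Then
   $f_j[\vec{x}:=\vec{h}]$ exists for $j = 1,2$, and
   $(f_1 \lor f_2)[\vec{x}:=\vec{h}] \equiv (f_1[\vec{x}:=\vec{h}]) \lor (f_2[\vec{x}:=\vec{h}])$;

2. Assume $(\neg f)[\vec{x}:=\vec{h}]$ exists. Then $f[\vec{x}:=\vec{h}]$
   exists, and
   $(\neg f)[\vec{x}:=\vec{h}] \equiv \neg(f[\vec{x}:=\vec{h}])$;

3. Assume $(\forall x{:}t^a[f])[\vec{x}:=\vec{h}]$ exists, and
   $x \notin \vec{x}$. Then $f[\vec{x}:=\vec{h}]$ exists, and
   $(\forall x{:}t^a[f])[\vec{x}:=\vec{h}] \equiv \forall x{:}t^a[f[\vec{x}:=\vec{h}]]$;

4. Assume $z(k_1,\ldots,k_n)[z:=f]$ exists, and $x_1 < \cdots < x_n$ are
   the free variables of $f$. Then $f[\vec{x}:=\vec{k}]$ exists, and
   $z(k_1,\ldots,k_n)[z:=f] \equiv f[\vec{x}:=\vec{k}]$;

5. Assume $z(k_1,\ldots,k_n)[\vec{x}:=\vec{h}]$ exists, $z \equiv x_p$, and
   $y_1 < \cdots < y_n$ are the free variables of $h_p \in \mathcal{P}$.
   Define $k'_i \equiv h_j$ if $k_i \equiv x_j$, and $k'_i \equiv k_i$
   otherwise. Then $h_p[\vec{y}:=\vec{k'}]$ exists, and
   $z(k_1,\ldots,k_n)[\vec{x}:=\vec{h}] \equiv h_p[\vec{y}:=\vec{k'}]$.


PROOF: Directly from the definition of substitution. $\boxtimes$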


2b The Ramified Theory of Types 


Having formalised the notion of propositional function in Section 2a, we
now give a precise description of the type theory underlying the Principia.
First we explicitly introduce types (Section 2b1; there is no such
introduction in the Principia), and then we formalise the notion “the
propositional function $f$ has type $t$” (Section 2b2).


2b1 Types 


Types in the Principia have a double hierarchy: one of (simple) types and 
one of orders. In Section 2b1.1 we introduce the first hierarchy. In Section 
2b1.2 we extend this hierarchy with orders, resulting in the ramified types 
of the Principia. 


2b1.1 Simple types 


As we saw in Section 1b, Frege already distinguished between objects,
functions that take objects as arguments, and functions that take functions
as arguments. He also made a distinction between functions that take one and
functions that take two arguments (see the quotations from Function and
Concept on p. 19). In the Principia, Whitehead and Russell use a similar
principle. Whilst Frege’s argument for this distinction was only that
functions are fundamentally different from objects, and that functions
taking objects as arguments are fundamentally different from functions
taking functions as arguments, Whitehead and Russell are more precise:


“[The difference between objects and propositional functions]
arises from the fact that a [propositional] function is essentially
an ambiguity, and that, if it is to occur in a definite proposition,
it must occur in such a way that the ambiguity has disappeared,
and a wholly unambiguous statement has resulted.”


(Principia Mathematica, p. 47)


There is no definition of “type” in the Principia, only a definition of
“being of the same type”:$^{11}$


“Definition of “being of the same type.” The following is a
step-by-step definition, the definition for higher types presupposing
that for lower types. We say that $u$ and $v$ “are of the same type” if


1. both are individuals,

2. both are elementary [propositional] functions$^{12}$ taking arguments
   of the same type,

3. $u$ is a pf and $v$ is its negation,

4. $u$ is $\varphi\hat{x}$$^{13}$ or $\psi\hat{x}$, and $v$ is
   $\varphi\hat{x} \lor \psi\hat{x}$, where $\varphi\hat{x}$ and
   $\psi\hat{x}$ are elementary pfs,

5. $u$ is $(y).\varphi(\hat{x},y)$$^{14}$ and $v$ is $(z).\psi(\hat{x},z)$,
   where $\varphi(\hat{x},\hat{y})$, $\psi(\hat{x},\hat{y})$ are of the same
   type,

6. both are elementary propositions,

7. $u$ is a proposition and $v$ is $\sim u$$^{15}$, or

8. $u$ is $(x).\varphi x$ and $v$ is $(y).\psi y$, where $\varphi\hat{x}$
   and $\psi\hat{x}$ are of the same type.”


(Principia Mathematica, *9·131, p. 133)


$^{11}$ See Definition 2.2 for the notion of elementary proposition. In the
Principia, an elementary pf is a pf that has elementary propositions as
values, when it takes suitable arguments.

$^{12}$ The term elementary function refers to a pf that has only
elementary propositions as value, when it takes suitable (well-typed)
arguments. See Principia, p. 92.

$^{13}$ Whitehead and Russell use $\varphi\hat{x}$ to denote that $\varphi$
is a pf that has, amongst others, $x$ as a free variable. Similarly, they
use $\varphi(\hat{x},\hat{y})$ to indicate that $\varphi$ has $x,y$ amongst
its free variables.

$^{14}$ Whitehead and Russell write $(x).\varphi(x)$ where we would write
$\forall x[f]$.


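The definition has to be seen as the definition of an equivalence relation.
For instance, assume that $\varphi\hat{x}$, $\psi\hat{x}$ and $\chi\hat{x}$
are elementary pfs. By rule 4, $\varphi\hat{x}$ and
$\varphi\hat{x} \lor \psi\hat{x}$ are of the same type, and so are
$\varphi\hat{x}$ and $\varphi\hat{x} \lor \chi\hat{x}$. By (implicit)
transitivity, $\varphi\hat{x} \lor \psi\hat{x}$ and
$\varphi\hat{x} \lor \chi\hat{x}$ are of the same type.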

The definition seems rather precise at first sight. But there are several 
remarks to be made: 


• The notion “being of the same type” seems to be defined for pfs taking
  one argument only. On the other hand, rules 2 and 5 suggest that such a
  definition should be extended to pfs taking two arguments. How this
  should be done is not made explicit;


• According to this definition, $z_1() \lor \neg z_1()$ is not of the same
  type as $z_1()$. The only rules from which it could be derived that
  $z_1()$ and $z_1() \lor \neg z_1()$ are of the same type are rules 2 and 4.
  But if we want to use these rules, $z_1()$ must be an elementary pf, which
  it is not: it can take the argument $\forall x[R(x)]$, which has as result
  the proposition $\forall x[R(x)]$. This is not an elementary proposition
  and therefore $z_1()$ is not an elementary pf.


So there are quite a few omissions in this definition. However, the
intention of the definition is clear: pfs that take a different number of
arguments, or that take arguments of different types, cannot be of the same
type.

In order to make precise what is meant by “being of the same type”, it is
easier to explicate what these types “are”. The notion “being of the same
type” can then be replaced by “having the same type”. The notion of simple
type as defined below is due to Ramsey [101] (1926). Historically, it is
incorrect to give Ramsey’s definition of simple type before Russell’s
definition of ramified type, as Russell’s definition is of an earlier date,
and Ramsey’s definition is in fact based on Russell’s ideas and not the
other way around. On the other hand, the ideas behind simple types were
already explained by Frege (see the quotes from Function and Concept on
page 19). Moreover, knowledge of the intuition behind simple types will make
it easier to understand the ramified ones. Therefore we present Ramsey’s
definition first.

$^{15}$ $\sim u$ is Principia notation for $\neg u$.


Definition 2.31 (Simple types)
1. $0$ is a simple type;

2. If $t_1,\ldots,t_n$ are simple types, then also $(t_1,\ldots,t_n)$ is a
   simple type. $n = 0$ is allowed: then we obtain the simple type $()$;

3. All simple types can be constructed using the rules 1 and 2.

We use $t, u, t_1, \ldots$ as metavariables over simple types.


Here, $(t_1,\ldots,t_n)$ is the type of pfs that should take $n$ arguments
(have $n$ free variables), the $i$th argument having type $t_i$. The type
$()$ stands for the type of the propositions, and the type $0$ stands for
the type of the individuals.


Remark 2.32 To formalise the notion of $i$th argument that a pf takes, we
use the alphabetical order on variables that was introduced in Section 2a.
The $i$th argument taken by a pf will be substituted for the $i$th free
variable of that pf, according to the alphabetical order.

Now it becomes clear why we considered the alphabetical order of variables
in the definition of $\alpha$-equality 2.18: we want $\alpha$-equal pfs to
have the same type. However, if $f$ has type $(t_1,t_2)$ and two free
variables $x < y$, and $g$ is the same as $f$ except that the roles of $x$
and $y$ have been switched, then $g$ will have type $(t_2,t_1)$. Therefore we
demand that the renaming of variables must maintain the alphabetical order.
See also Remark 2.47.7.


Example 2.33 The propositional function $R(x)$ should have type $(0)$, as
it takes one individual as argument.

The propositional function $z(R(x), S(a))$ (see Remark 2.10.4) takes one
argument. This argument must be a pf that can take $R(x)$ as its first
argument (so this first argument must be of type $(0)$), and a proposition
(of type $()$) as its second argument. We conclude that in $z(R(x),S(a))$,
we must substitute pfs of type $((0),())$ for $z$. Therefore,
$z(R(x),S(a))$ has type $(((0),()))$.


The intuition presented in Remark 2.32 and Example 2.33 will be formalised
in 2.45. Theorem 2.58 shows that this formalisation follows the intuition.

Just as propositional functions can be translated to $\lambda$-terms,
simple types can be translated to types of the simply typed
$\lambda$-calculus of Church (see [30], and Section Ab of the Appendix).


Definition 2.34 We define a type $T(t)$ for each simple type $t$ by
induction:


1. $T(0) \overset{\mathrm{def}}{=} \iota$;

2. $T((t_1,\ldots,t_n)) \overset{\mathrm{def}}{=} T(t_1) \to \cdots \to T(t_n) \to o$.

A simple type $t$ of Definition 2.31 has the same interpretation as its
translation $T(t)$. Moreover, $T$ is injective:


Lemma 2.35 If $t$ and $u$ are simple types, then $T(t) = T(u)$ if and only
if $t = u$.


PROOF: Induction on the definition of simple type. $\boxtimes$
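
As a small illustration (our own, with an assumed encoding), the translation $T$ can be written as a recursive Python function. The simple type $0$ is represented by the integer `0`, and a type $(t_1,\ldots,t_n)$ by a Python tuple, so `()` is the type of propositions. The result is the Church type printed as a string, with $\iota$ for individuals and $o$ for propositions.

```python
def T(t):
    """Translate a simple type into a Church type, rendered as a string."""
    if t == 0:                     # the type of individuals
        return "ι"
    parts = []
    for u in t:                    # argument types, left to right
        s = T(u)
        # Function types in argument position need parentheses.
        parts.append(s if s in ("ι", "o") else "(" + s + ")")
    parts.append("o")              # every pf ultimately yields a proposition
    return " → ".join(parts)

samples = [0, (), (0,), ((0,), ()), (((0,), ()),)]
for t in samples:
    print(t, "|", T(t))
```

The last sample is the type $(((0),()))$ assigned to $z(R(x),S(a))$ in Example 2.33. On these samples the printed types are pairwise distinct, as Lemma 2.35 predicts, though a handful of samples is of course no proof of injectivity.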


Notation 2.36 From now on we will use a slightly different notation for
quantification in pfs. Instead of $\forall x[f]$ we now explicitly mention
the type (say: $t$) over which is quantified: $\forall x{:}t[f]$. We do the
same with the translations of pfs to $\lambda$-calculus: instead of
$\lambda x.F$ we write $\lambda x{:}T(t).F$.


2b1.2 Ramified types 


Up to now, the type of a pf only depends on the types of the arguments that it can take. In the Principia, a second hierarchy is introduced by regarding also the types of the variables that are bound by a quantifier (see Principia, pp. 51-55). Whitehead and Russell consider, for instance, the propositions $R(a)$ and $\forall z{:}()[z() \lor \lnot z()]$ to be of a different level. The first is an atomic proposition, while the latter is based on the pf $z() \lor \lnot z()$. The pf $z() \lor \lnot z()$ involves an arbitrary proposition $z$, therefore $\forall z{:}()[z() \lor \lnot z()]$ quantifies over all propositions $z$. According to the vicious circle principle 2.1, $\forall z{:}()[z() \lor \lnot z()]$ cannot belong to this collection of propositions.

This problem is solved by dividing types into orders (not to be confused with the alphabetical order on the variables). An order is simply a natural number. Basic propositions are of order 0, and in $\forall z{:}()[z() \lor \lnot z()]$ we must mention the order of the propositions over which is quantified. The pf $\forall z{:}()^n[z() \lor \lnot z()]$ quantifies over all propositions of order $n$, and has order $n+1$.

The division of types into orders gives ramified types. 


Definition 2.37 (Ramified types)
1. $0^0$ is a ramified type;


2. If $t_1^{a_1},\ldots,t_n^{a_n}$ are ramified types, and $a \in \mathbb{N}$, $a > \max(a_1,\ldots,a_n)$, then $(t_1^{a_1},\ldots,t_n^{a_n})^a$ is a ramified type (if $n = 0$ then take $a \geq 0$);


3. All ramified types can be constructed using the rules 1 and 2.


If $t^a$ is a ramified type, then $a$ is called the order of $t^a$.


Remark 2.38 In $(t_1^{a_1},\ldots,t_n^{a_n})^a$, we demand that $a > a_i$ for all $i$. This is because a pf of this type presupposes all the elements of type $t_i^{a_i}$, and therefore must be of an order that is higher than $a_i$.


Example 2.39 We give some examples of ramified types:
• $0^0$;

• $(0^0)^1$;

• $((0^0)^1, (0^0)^4)^5$;

• $(0^0, ()^2, (0^0,(0^0)^1)^2)^7$.

$(0^0, (0^0,(0^0)^2)^2)^7$ is not a ramified type.


Ramified types can also be translated to types of the simply typed λ-calculus. However, we lose the orders if we do so.


Definition 2.40 We define a type $T(t)$ for each ramified type $t$ by induction:

1. $T(0^0) \stackrel{\text{def}}{=} \iota$;


2. $T((t_1^{a_1},\ldots,t_n^{a_n})^a) \stackrel{\text{def}}{=} T(t_1^{a_1}) \to \cdots \to T(t_n^{a_n}) \to o$.


In the rest of this Chapter we simply speak of types when we mean 
ramified types, as long as no confusion arises. 

In the type $(0^0)^1$, all orders are “minimal”, i.e., not higher than strictly necessary. This is, for instance, not the case in the type $(0^0)^2$. Types in which all orders are minimal are called predicative and play a special role in the Ramified Theory of Types. A formal definition:


Definition 2.41 (Predicative types)
1. $0^0$ is a predicative type;


2. If $t_1^{a_1},\ldots,t_n^{a_n}$ are predicative types, and $a = 1 + \max(a_1,\ldots,a_n)$ (take $a = 0$ if $n = 0$), then $(t_1^{a_1},\ldots,t_n^{a_n})^a$ is a predicative type;


3. All predicative types can be constructed using the rules 1 and 2 above.

The mapping $T$ is injective when restricted to predicative types:


Lemma 2.42 If $t^a$ and $u^b$ are predicative types, then $T(t^a) = T(u^b)$ if and only if $t^a = u^b$.


PROOF: Induction on the definition of predicative type. □
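
The side conditions of Definitions 2.37 and 2.41 are simple arithmetic checks on orders. The sketch below is a hedged illustration in Python; the class `RType`, the helper `pf`, and the function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RType:
    """A candidate ramified type: an order plus argument types.

    args is None for the base type 0; otherwise it is the tuple
    (t1, ..., tn) of argument types of a propositional function.
    """
    order: int
    args: Optional[Tuple["RType", ...]] = None


IND = RType(0)  # the type 0^0 of individuals


def pf(order, *args):
    """Build the candidate type (args)^order."""
    return RType(order, tuple(args))


def is_ramified(t):
    """Definition 2.37: a > max(a1,...,an), or a >= 0 when n = 0."""
    if t.args is None:
        return t.order == 0
    if not all(is_ramified(u) for u in t.args):
        return False
    if not t.args:
        return t.order >= 0
    return t.order > max(u.order for u in t.args)


def is_predicative(t):
    """Definition 2.41: a = 1 + max(a1,...,an), or a = 0 when n = 0."""
    if t.args is None:
        return t.order == 0
    if not all(is_predicative(u) for u in t.args):
        return False
    expected = 1 + max(u.order for u in t.args) if t.args else 0
    return t.order == expected


if __name__ == "__main__":
    print(is_predicative(pf(1, IND)))  # (0^0)^1: all orders minimal
    print(is_predicative(pf(2, IND)))  # (0^0)^2: ramified, order too high
```

Under these assumptions every predicative type is ramified, while a type such as $(0^0)^2$ passes `is_ramified` but fails `is_predicative`.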


2b2 Formalisation of the Ramified Theory of Types 


In this section we formalise the intuition on types presented in Example 
2.33 and Definition 2.34 together with the intuition on orders that was 
given at the beginning of Section 2b1.2. Before we can do this we must 
introduce some additional terminology. 

In the pf $R(x)$ we implicitly assume that $x$ is a variable for which objects of type 0 must be substituted. For our formalisation we want to make the information on the type of a variable explicit. We do this by storing this information in so-called contexts. Contexts, common in modern type systems, are not used in the Principia.

Definition 2.43 (Contexts) Let $x_1,\ldots,x_n \in V$ be distinct variables, and assume $t_1^{a_1},\ldots,t_n^{a_n}$ are ramified types. Then $\{x_1{:}t_1^{a_1},\ldots,x_n{:}t_n^{a_n}\}$ is a context. The set $\{x_1,\ldots,x_n\}$ is called the domain of the context and is denoted by $\mathrm{dom}(\{x_1{:}t_1^{a_1},\ldots,x_n{:}t_n^{a_n}\})$. We will use Greek capitals $\Gamma$, $\Delta$ as meta-variables over contexts.


The pfs $z_1(y_1)$ and $z_2(y_2)$ are α-equal, according to Definition 2.18. But in a context $\Gamma = \{y_1{:}0^0,\ z_1{:}(0^0)^1,\ y_2{:}(0^0)^1,\ z_2{:}((0^0)^1)^2\}$ one does not want to see $z_1(y_1)$ and $z_2(y_2)$ as equal, as the types of $y_1$ and $y_2$ differ, and the types of $z_1$ and $z_2$ differ as well. Therefore, we introduce a more restricted version of α-equality:


Definition 2.44 Let $\Gamma$ be a context and $f$ and $g$ pfs. We say that $f$ and $g$ are $\alpha_\Gamma$-equal, notation $f =_{\alpha,\Gamma} g$, if there is a bijection $\varphi : V \to V$ such that


• $g$ can be obtained from $f$ by replacing each variable that occurs in $f$ by its $\varphi$-image;

• $x < y$ iff $\varphi(x) < \varphi(y)$;

• $x{:}t \in \Gamma$ iff $\varphi(x){:}t \in \Gamma$.


We will now define what we mean by $\Gamma \vdash f : t^a$, or, in words: $f$ is of type $t^a$ in the context $\Gamma$.^16 In this definition we will try to follow the line of the Principia as much as possible. If $\Gamma = \emptyset$ then we will write $\vdash f : t^a$.

We explain some aspects of the following definition in Section 2b3.


Definition 2.45 (Ramified Theory of Types: RTT) The judgement $\Gamma \vdash f : t^a$ is inductively defined as follows:


1. (start) For all $a \in A$:
$$\vdash a : 0^0.$$


For all atomic pfs $f$:
$$\vdash f : ()^0;$$


16 The symbol $\vdash$ in $\Gamma \vdash f : t^a$ is the same symbol that Frege used to assert a proposition. It enters Type Theory in 1934 [36], via Curry’s combinatory logic. Curry defines a functionality combinator $F$ in such a way that $FXYf$ holds, exactly if $f$ is a function from $X$ to $Y$. To denote the assertion of $FXYf$, Curry uses Frege’s symbol $\vdash$.
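2. (connectives) Assume $\Gamma \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$, $\Delta \vdash g : (u_1^{b_1},\ldots,u_m^{b_m})^b$, and $x < y$ for all $x \in \mathrm{dom}(\Gamma)$ and $y \in \mathrm{dom}(\Delta)$. Then

$$\Gamma \cup \Delta \vdash f \lor g : (t_1^{a_1},\ldots,t_n^{a_n},u_1^{b_1},\ldots,u_m^{b_m})^{\max(a,b)};$$

$$\Gamma \vdash \lnot f : (t_1^{a_1},\ldots,t_n^{a_n})^a;$$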


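3. (abstraction from parameters) If $\Gamma \vdash f : (t_1^{a_1},\ldots,t_m^{a_m})^a$, $t_{m+1}^{a_{m+1}}$ is a predicative type^17, $g \in A \cup P$ is a parameter of $f$, $\Gamma \vdash g : t_{m+1}^{a_{m+1}}$, and $x < y$ for all $x \in \mathrm{dom}(\Gamma)$, then

$$\Gamma' \vdash h : (t_1^{a_1},\ldots,t_{m+1}^{a_{m+1}})^{\max(a,\,a_{m+1}+1)}.$$

Here, $h$ is a pf obtained by replacing all parameters $g'$ of $f$ which are $\alpha_\Gamma$-equal to $g$ by $y$. Moreover, $\Gamma'$ is the subset of the context $\Gamma \cup \{y : t_{m+1}^{a_{m+1}}\}$ such that $\mathrm{dom}(\Gamma')$ contains exactly all the variables that occur in $h$^18;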


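4. (abstraction from pfs) If $(t_1^{a_1},\ldots,t_n^{a_n})^a$ is a predicative type^17, $\Gamma \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$, $x < z$ for all $x \in \mathrm{dom}(\Gamma)$, and $y_1 < \cdots < y_n$ are the free variables of $f$, then

$$\Gamma' \vdash z(y_1,\ldots,y_n) : (t_1^{a_1},\ldots,t_n^{a_n},(t_1^{a_1},\ldots,t_n^{a_n})^a)^{a+1},$$

where $\Gamma'$ is the subset of $\Gamma \cup \{z{:}(t_1^{a_1},\ldots,t_n^{a_n})^a\}$ such that $\mathrm{dom}(\Gamma') = \{y_1,\ldots,y_n,z\}$;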


5. (weakening) If $\Gamma$, $\Delta$ are contexts, $\Gamma \subseteq \Delta$, and $\Gamma \vdash f : t^a$, then also $\Delta \vdash f : t^a$;


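6. (substitution) If $y$ is the $i$th free variable in $f$ (according to the order on variables), and $\Gamma \cup \{y : t_i^{a_i}\} \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$ and $\Gamma \vdash k : t_i^{a_i}$ then

$$\Gamma' \vdash f[y:=k] : (t_1^{a_1},\ldots,t_{i-1}^{a_{i-1}},t_{i+1}^{a_{i+1}},\ldots,t_n^{a_n})^b.$$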


17 The restriction to predicative types only is based on Principia, pp. 53-54.
18 In Lemma 2.56 we prove that this context always exists.
Here, $b = 1 + \max(a_1,\ldots,a_{i-1},a_{i+1},\ldots,a_n,c)$, and
$$c = \max\{j \mid \forall x{:}t^j \text{ occurs in } f[y:=k]\}$$

(if $n = 1$ and $\{j \mid \forall x{:}t^j \text{ occurs in } f[y:=k]\} = \emptyset$ then take $b = 0$) and once more, $\Gamma'$ is the subset of $\Gamma \cup \{y : t_i^{a_i}\}$ such that $\mathrm{dom}(\Gamma')$ contains exactly all the variables that occur in $f[y:=k]$^18;


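7. (permutation) If $y$ is the $i$th free variable in $f$ (according to the order on variables), and $\Gamma \cup \{y{:}t_i^{a_i}\} \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$, and $x < y'$ for all $x \in \mathrm{dom}(\Gamma)$, then

$$\Gamma' \vdash f[y:=y'] : (t_1^{a_1},\ldots,t_{i-1}^{a_{i-1}},t_{i+1}^{a_{i+1}},\ldots,t_n^{a_n},t_i^{a_i})^a.$$

$\Gamma'$ is the subset of $\Gamma \cup \{y{:}t_i^{a_i},\ y'{:}t_i^{a_i}\}$ such that $\mathrm{dom}(\Gamma')$ contains exactly all the variables that occur in $f[y:=y']$^18;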


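8. (quantification) If $y$ is the $i$th free variable in $f$ (according to the order on variables), and $\Gamma \cup \{y{:}t_i^{a_i}\} \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$, then

$$\Gamma \vdash \forall y{:}t_i^{a_i}[f] : (t_1^{a_1},\ldots,t_{i-1}^{a_{i-1}},t_{i+1}^{a_{i+1}},\ldots,t_n^{a_n})^a.$$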


Definition 2.46 A pf $f$ is called legal, if there is a context $\Gamma$ and a ramified type $t^a$ such that $\Gamma \vdash f : t^a$.
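
The type arithmetic in rules 4, 6 and 8 of Definition 2.45 can be tried out in isolation. The Python sketch below works on types only (it does not model pfs, contexts, or the side conditions on variables), and all names in it (`IND`, `pf_type`, `abstract_pf`, `substitute`, `quantify`) are hypothetical. For rule 6 the orders of the quantified variables in the result must be passed in by hand, since the pf itself is not represented.

```python
IND = ("0", 0)  # the type 0^0 of individuals (encoding assumed here)


def pf_type(order, *args):
    """The ramified type (args)^order, encoded as (tuple_of_args, order)."""
    return (tuple(args), order)


def is_predicative(t):
    """Definition 2.41 on this encoding."""
    args, a = t
    if args == "0":
        return a == 0
    if not all(is_predicative(u) for u in args):
        return False
    return a == (1 + max(u[1] for u in args) if args else 0)


def abstract_pf(f_type):
    """Rule 4: (t1,...,tn)^a becomes (t1,...,tn,(t1,...,tn)^a)^(a+1)."""
    if not is_predicative(f_type):
        raise ValueError("rule 4 needs a predicative type")
    args, a = f_type
    return pf_type(a + 1, *args, f_type)


def substitute(f_type, i, quantifier_orders=()):
    """Rule 6: drop the i-th argument type (1-based), recompute the order.

    b = 1 + max of the remaining argument orders and the orders of the
    quantified variables; b = 0 when there is nothing left to maximise.
    """
    args, _ = f_type
    rest = args[:i - 1] + args[i:]
    orders = [u[1] for u in rest] + list(quantifier_orders)
    return pf_type(1 + max(orders) if orders else 0, *rest)


def quantify(f_type, i):
    """Rule 8: drop the i-th argument type (1-based), keep the order."""
    args, a = f_type
    return pf_type(a, *(args[:i - 1] + args[i:]))


if __name__ == "__main__":
    # Last step of Example 2.55: from (()^0)^1 to ()^1, not predicative.
    t = quantify(pf_type(1, pf_type(0)), 1)
    print(t, is_predicative(t))
```

The last lines mirror the observation of Example 2.55: quantification keeps the order, so the resulting type $()^1$ is ramified but not predicative.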


2b3 Discussion and examples 


We will make some remarks on Definition 2.45. First of all, we motivate the 
eight rules of 2.45 by referring to passages in the Principia. Then we make 
some technical remarks, and give some examples of how the rules work. 
It will be made clear that the substitution rule is problematic, because 
substitution is not clearly defined in the Principia. 


Remark 2.47 We will motivate RTT (Definition 2.45) by referring to the Principia:
1. Individuals and elementary judgements (atomic propositions) are, also in the Principia, the basic ingredients for creating legal pfs;^19


19 As for individuals: see Principia, *9, p. 132, where “Individual” is presented as a primitive idea. As for elementary judgements: See Principia, Introduction, pp. 43-45.
2. We can see rule 2 “at work” in *12, p. 163 of the Principia^20:


“We can build up a number of new formulas, such as [...] $\varphi!x \lor \varphi!y$, $\varphi!x \lor \psi!x$, $\varphi!x \lor \psi!y$, [...] and so on.”


(Principia Mathematica, *12, p. 163)


The restriction about contexts that we make in rule 2 has technical 
reasons and is not made in the Principia. It will be discussed in 2.49; 


3. Rule 3 is justified by *9·14 and *9·15 in the Principia. It is an instantiation of the abstraction principles 1.1 and 1.2 for functions that were already proposed by Frege. In Frege’s definition one does not have to replace all parameters $g'$ that are $\alpha_\Gamma$-equal to $g$, but one can also take some of these parameters. In Section 2d we show that this is not a serious restriction.


The restriction to predicative types is in line with the Principia (cf. 
Principia, pp. 53-54); 


4. Rule 4 is based on the Introduction of the Principia. There, pfs are 
constructed, and 


“the first matrices that occur are those whose values are of the forms $\varphi x$, $\psi(x, y)$, $\chi(x, y, z, \ldots)$, i.e. where the arguments, however many there may be, are all individuals. Such [propositional] functions we will call ‘first-order functions.’ We may now introduce a notation to express ‘any first-order function.’ ”


(Principia Mathematica, p. 51) 


This quote from the Principia is an instance of Frege’s abstraction 
principles, and so is rule 4 of our formalisation. It results in second 
order pfs, and the process can be iterated to obtain pfs of higher 
orders. 

Rule 4 makes it possible to introduce variables of higher order. In fact, leaving out rule 4 would lead to first-order predicate logic, as without rule 4 it is impossible to introduce variables of types that differ from $0^0$.


The use of predicative types only is inspired by the Principia, again;


20 In the Principia, Whitehead and Russell write $\varphi!x$ instead of $\varphi x$ to indicate that $\varphi x$ is not only (what we would call) a pf, but even a legal pf.


5. The weakening rule cannot be found in the Principia, because no formal contexts are used there. It is implicitly present, however: the addition of an extra variable to the set of variables does not affect the well-typedness of pfs that were already constructed;


6. The rule of substitution is based on *9·14 and *9·15 of the Principia, and can be seen as an inverse of the abstraction operators in rules 3 and 4. Notice that we do not know yet whether the substitution $f[y:=k]$ exists or not. Therefore, we limit the use of rule 6 to the cases in which the substitution exists. In Section 2c2 we show that it always exists if the premises of rule 6 are fulfilled;


7. In the system above, the (sequential) order of the $t_i$'s is related to the alphabetic order of the free variables of the pf $f$ that has type $(t_1,\ldots,t_n)$ (see the remark before Definition 2.18, Remark 2.32, and Theorem 2.58). This alphabetic order plays a role in the clear presentation of results like Theorem 2.58, and in the definition of substitution (see Remark 2.28).


With rule 7 we want to express that the order of the $t_i$'s in $(t_1,\ldots,t_n)$ and the alphabetic order of the variables are not characteristics of the Principia, but are only introduced for the technical reasons explained in this remark. This is worked out in Corollary 2.59;


8. Notice that in the quantification rule, both $f$ and $\forall x{:}t_i^{a_i}[f]$ have order $a$. The intuition is that the order of a propositional function $f$ equals one plus the maximum of the orders of all the variables (either free or bound by a quantifier) in $f$. This is in line with the Principia: see [121], page 53. See also the introduction to Definition 2.37, and the proof of Lemma 2.60 below.


Remark 2.48 Rules 3 and 4 are a restricted version of the abstraction principles of Frege, with less power. It is, for instance, not possible to imitate all the abstractions of Remark 2.10 by using rules 3 and 4 only. But in combination with the other rules, rules 3 and 4 are sufficient (see Example 2.54 for the cases of Remark 2.10, and Section 2d, especially Theorem 2.84).


Remark 2.49 In rule 2 of RTT, we make the assumption that the variables of $\Gamma$ must all come before the variables of $\Delta$. The reason for this is that we want to prevent undesired results like

$$x_1{:}0^0 \vdash R_1(x_1) \lor R_2(x_1) : (0^0, 0^0)^1.$$

In fact, $R_1(x_1) \lor R_2(x_1)$ has only one free variable, so its type should be $(0^0)^1$ and not $(0^0,0^0)^1$ (see Example 2.53, second part). For technical reasons (the order of the $t_i^{a_i}$'s; see also Theorem 2.58) we strengthen the assumption such that for $x \in \mathrm{dom}(\Gamma)$ and $y \in \mathrm{dom}(\Delta)$, $x < y$ must hold.

As Whitehead and Russell do not have a formal notation for types, they do not forbid this kind of construction in the Principia. In 2.82 we show that our limitation to contexts with disjoint domains as made in rule 2 is not a real limitation: all the desired judgements can still be derived for contexts with non-disjoint domains.
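
The effect of the side condition in rule 2 can be seen in a small Python sketch. It is only an illustration under stated assumptions: contexts are dictionaries from variable names to types, types are pairs (argument types, order), and the alphabetical order on variables is stood in for by Python's string order. The name `disjunction` is hypothetical.

```python
IND = ("0", 0)  # the type 0^0 of individuals (encoding assumed here)


def disjunction(gamma, f_type, delta, g_type):
    """Rule 2 for f ∨ g on types and contexts only.

    Raises ValueError unless every variable of gamma precedes every
    variable of delta (string order is an assumed stand-in for the
    alphabetical order on variables).
    """
    for x in gamma:
        for y in delta:
            if not x < y:
                raise ValueError(f"{x} does not precede {y}")
    (f_args, a), (g_args, b) = f_type, g_type
    # Argument types are concatenated; the order is max(a, b).
    return {**gamma, **delta}, (f_args + g_args, max(a, b))


if __name__ == "__main__":
    r = ((IND,), 1)  # the type (0^0)^1 of R1(x1) or R2(x2)
    print(disjunction({"x1": IND}, r, {"x2": IND}, r))
    try:
        disjunction({"x1": IND}, r, {"x1": IND}, r)
    except ValueError as err:
        print("rejected:", err)  # x1 < x1 fails because < is strict
```

The second call is exactly the undesired case of this remark: both premises use $x_1$, so the strict order check fails and the type $(0^0,0^0)^1$ is never produced.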


Remark 2.50 In both rules 3 and 4 we see that it is necessary to introduce at least one new variable. It is, for instance, not possible to interpret the proposition $R(a)$ as a (constant) pf of type $(0^0)^1$. This is in line with the abstraction principles of Frege and Russell. In Frege’s definition 1.1, for example, it is explicitly mentioned that the object that is to be replaced occurs at least once in the expression.

Translated to λ-calculus this means that the Principia have λI-terms, only. See also Lemma 2.9.3 and Lemma 2.23.3.


Remark 2.51 Contexts as used in RTT contain, in a sense, too much in- 
formation: not only information on all free variables, but also information 
on non-free variables (Cf. rules 3, 6 and 7. The set of non-free variables 
contains more than only the variables that are bound by a quantifier. For 
example, in the pf z(R(x)), x is neither free, nor bound by a quantifier). 


Remark 2.52 The system is based on the abstraction principles of Frege. In a context $\Gamma$, one cannot introduce a variable of a certain type $t$ unless one has a pf (or an individual) $f$ that has type $t$ in $\Gamma$. This is different from modern, λ-calculus based systems, where one can introduce a variable of a type $u$ without knowing whether or not there are terms of this type $u$.
We give some examples, in order to illustrate how our system works. Example 2.53 shows applications of the rules. Example 2.54 makes a link between the intuitive notion of abstraction that was explained in Remark 2.10 and the abstraction rules 3 and 4 of our system.

We will use a notation of the form
$$\frac{X_1 \quad \cdots \quad X_n}{Y}\;N$$
indicating that from the judgements $X_1,\ldots,X_n$, we can infer the judgement $Y$ by using the RTT-rule of Definition 2.45 with number $N$. As usual, this is called a derivation step. Subsequent derivation steps give a derivation. A derivation of a judgement $Y$ is a derivation tree with $Y$ as root (the final conclusion). The types in the examples below are all predicative (as a pf of impredicative type must have a quantifier, and the examples below are quantifier-free). To avoid too much notation, we omit the orders.


Example 2.53
• $\vdash S(a_1,a_2) : ()$;

• $$\frac{\vdash R_1(a_1) : () \qquad \vdash R_2(a_2) : ()}{\vdash R_1(a_1) \lor R_2(a_2) : ()}\;2$$
but not:
$$\frac{x_1{:}0 \vdash R_1(x_1) : (0) \qquad x_1{:}0 \vdash R_2(x_1) : (0)}{x_1{:}0 \vdash R_1(x_1) \lor R_2(x_1) : (0,0)}$$
($x_1 \not< x_1$ because $<$ is strict). To obtain $R_1(x_1) \lor R_2(x_1)$ we must make a different start:
$$\frac{\dfrac{\vdash R_1(a_1) : () \qquad \vdash R_2(a_1) : ()}{\vdash R_1(a_1) \lor R_2(a_1) : ()}\;2 \qquad \vdash a_1 : 0}{x_1{:}0 \vdash R_1(x_1) \lor R_2(x_1) : (0)}\;3;$$

• $$\frac{x_1{:}0,\, x_2{:}0,\, z_1{:}((0),(0)) \vdash z_1(R(x_1),R(x_2)) : (((0),(0))) \qquad x_1{:}0,\, x_2{:}0,\, z_1{:}((0),(0)) \vdash R(x_1) : (0)}{z_1{:}((0),(0)),\, z_2{:}(0) \vdash z_1(z_2,z_2) : (((0),(0)),(0))}\;3$$
As $R(x_1)$ is α-equal to $R(x_2)$ in the context, both $R(x_1)$ and $R(x_2)$ are replaced by the newly introduced variable $z_2$;

• $$\frac{x_1{:}0,\, x_2{:}0 \vdash S(x_1,x_2) : (0,0)}{x_1{:}0,\, x_2{:}0,\, z{:}(0,0) \vdash z(x_1,x_2) : (0,0,(0,0))}\;4;$$

• $$\frac{\dfrac{x_1{:}0 \vdash R_1(x_1) \lor R_2(x_1) : (0) \qquad \vdash a_1 : 0}{\vdash R_1(a_1) \lor R_2(a_1) : ()}\;6}{x_1{:}0 \vdash R_1(a_1) \lor R_2(a_1) : ()}\;5;$$

• $$\frac{x_1{:}0,\, x_2{:}0,\, x_3{:}(0,0) \vdash R(x_1) \lor \lnot x_3(x_1,x_2) : (0,0,(0,0)) \qquad x_1{:}0,\, x_2{:}0 \vdash T(x_1,x_1,x_2) : (0,0)}{x_1{:}0,\, x_2{:}0 \vdash R(x_1) \lor \lnot T(x_1,x_1,x_2) : (0,0)}\;6$$
$T(x_1,x_1,x_2)$ is substituted for $x_3$.


Example 2.54 We give a formal derivation of the examples of the abstraction rules that were given in Remark 2.10. Again, we omit the orders.


• Constructing $z(a) \lor S(a)$ from $R(a) \lor S(a)$ cannot be done with the use of rule 4 only. The following derivation is correct:
$$\frac{\dfrac{\dfrac{\vdash a{:}0}{z{:}(0) \vdash a{:}0}\;5 \qquad \dfrac{\dfrac{\vdash a{:}0 \qquad \vdash R(a){:}()}{x{:}0 \vdash R(x){:}(0)}\;3}{x{:}0,\, z{:}(0) \vdash z(x){:}(0,(0))}\;4}{z{:}(0) \vdash z(a){:}((0))}\;6 \qquad \vdash S(a){:}()}{z{:}(0) \vdash z(a) \lor S(a) : ((0))}\;2$$
To obtain $z(a)$ instead of $z()$, we must transform $R(a)$ into a pf $R(x)$ by abstracting from $a$. Then we can construct $z(x)$ by abstraction from pfs (rule 4). In this way, the “frame” for $z(a)$ is of the right form. Substituting $a$ for $x$ gives $z(a)$ (and “neutralises” the application of rule 3 at the top of the derivation). Simply applying rule 4 on the judgement $\vdash R(a) : ()$ does not work: it results in $z{:}() \vdash z() : (())$;


• Constructing $z_1() \lor z_2()$ is easier: $z_1()$ can be obtained by abstracting from $R(a)$, and $z_2()$ similarly from $S(a)$. Result:
$$\frac{\dfrac{\vdash R(a){:}()}{z_1{:}() \vdash z_1(){:}(())}\;4 \qquad \dfrac{\vdash S(a){:}()}{z_2{:}() \vdash z_2(){:}(())}\;4}{z_1{:}(),\, z_2{:}() \vdash z_1() \lor z_2() : ((),())}\;2$$
We see that in fact two abstractions are needed to construct this pf: we must abstract from $R(a)$ as an instance of the pf $z_1()$, and from $S(a)$ as an instance of the pf $z_2()$. As rule 4 does not work on parts of pfs, these abstractions have to be made before we use rule 2. Applying rule 4 on $\vdash R(a) \lor S(a) : ()^0$ would result in $z{:}() \vdash z() : (())$;

• We can extend the derivation of $z_1{:}(),\, z_2{:}() \vdash z_1() \lor z_2() : ((),())$ to obtain a type for $z(R(a), S(a))$:
$$\frac{\dfrac{\dfrac{x_1{:}(),\, x_2{:}() \vdash x_1() \lor x_2() : ((),())}{x_1{:}(),\, x_2{:}(),\, z{:}((),()) \vdash z(x_1,x_2) : ((),(),((),()))}\;4}{x_2{:}(),\, z{:}((),()) \vdash z(R(a),x_2) : ((),((),()))}\;6}{z{:}((),()) \vdash z(R(a),S(a)) : (((),()))}\;6$$
(for reasons of space, we omitted the premises $z{:}((),()),\, x_2{:}() \vdash R(a){:}()$ and $z{:}((),()) \vdash S(a){:}()$ of the first and second application of the substitution rule);


• For the derivation of the type of $z(R(x), S(a))$ we first make a derivation of the “frame” $z(y_1, y_2)$ of this pf:
$$\frac{\dfrac{\dfrac{\dfrac{\vdash a{:}0}{y_1{:}(0) \vdash a{:}0}\;5 \qquad \dfrac{\dfrac{\vdash a{:}0 \qquad \vdash R(a){:}()}{x{:}0 \vdash R(x){:}(0)}\;3}{x{:}0,\, y_1{:}(0) \vdash y_1(x){:}(0,(0))}\;4}{y_1{:}(0) \vdash y_1(a){:}((0))}\;6 \qquad \dfrac{\vdash R(a){:}()}{y_2{:}() \vdash y_2(){:}(())}\;4}{y_1{:}(0),\, y_2{:}() \vdash y_1(a) \lor y_2() : ((0),())}\;2}{y_1{:}(0),\, y_2{:}(),\, z{:}((0),()) \vdash z(y_1,y_2) : ((0),(),((0),()))}\;4$$
Then we derive $x{:}0 \vdash R(x){:}(0)$ and $\vdash S(a){:}()$, and after applying the weakening rule, we can substitute $R(x)$ for $y_1$ and $S(a)$ for $y_2$. As a result, we get

$$z{:}((0),()),\, x{:}0 \vdash z(R(x),S(a)) : (((0),())).$$


Example 2.55 In the example below, the orders are important: 


HR(a) : ()° 
FR(a):Q® Fra): 0°” 


F aR(a) : () 2 
F R(a) V =R(a) : ()° 


z z0 vz (OD 


F ¥z:()°[2() Vz()]: 0! 


We see that $\forall z{:}()^0[z() \lor \lnot z()]$ does not have a predicative type. This is the case because this pf has a bound variable $z$ that is of a higher order than the order of any free variable (as there are no free variables here). Therefore, the order of this pf is determined by the order of the bound variable $z$.
We still need to prove that the contexts in the conclusions of rules 3, 4 and 6 exist. This follows from the following Lemma:


Lemma 2.56 Assume $\Gamma \vdash f : t^a$. Then


1. (Free variable lemma) All variables of $f$ that are not bound by a quantifier are in $\mathrm{dom}(\Gamma)$;


2. (Strengthening lemma) If $\Delta$ is the (unique) subset of $\Gamma$ such that $\mathrm{dom}(\Delta)$ contains exactly all the variables of $f$ that are not bound by a quantifier, then $\Delta \vdash f : t^a$.


PROOF: An easy induction on the definition of $\Gamma \vdash f : t^a$. □


2c Properties of RTT 


2c1 Types and free variables


In this section we treat some meta-properties of RTT. Using the λ-notation for pfs, we can often refer to known results in typed λ-calculus^21.


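Theorem 2.57 (First Free Variable Theorem)
Let $f \in P$; $k_1,\ldots,k_n \in A \cup V \cup P$.

$$FV(f[x_1,\ldots,x_n:=k_1,\ldots,k_n]) = (FV(f) \setminus \{x_1,\ldots,x_n\}) \cup \{k_i \in V \mid x_i \in FV(f)\}.$$


PROOF: Write $h = f[x_1,\ldots,x_n:=k_1,\ldots,k_n]$. We know that $\bar h$ is the β-normal form of the λ-term $(\lambda_{i=1}^{n} x_i.\bar f)\tilde k_1 \cdots \tilde k_n$. By Lemma 2.23.3 and Lemma 2.9.4 we know that $\bar f, \tilde k_1, \ldots, \tilde k_n$ are all λI-terms. We conclude that $\bar f[x_1:=\tilde k_1]\cdots[x_n:=\tilde k_n]$ is also a λI-term. As

$$(\lambda_{i=1}^{n} x_i.\bar f)\tilde k_1 \cdots \tilde k_n \twoheadrightarrow_\beta \bar f[x_1:=\tilde k_1]\cdots[x_n:=\tilde k_n]$$

we have by the Church-Rosser Theorem that $\bar f[x_1:=\tilde k_1]\cdots[x_n:=\tilde k_n] \twoheadrightarrow_\beta \bar h$, and therefore:

$$\begin{array}{rcl}
FV(h) &=& FV(\bar h) \\
&\stackrel{(1)}{=}& FV(\bar f[x_1:=\tilde k_1]\cdots[x_n:=\tilde k_n]) \\
&=& (FV(\bar f) \setminus \{x_1,\ldots,x_n\}) \cup \bigcup\{FV(\tilde k_i) \mid x_i \in FV(\bar f)\} \\
&\stackrel{(2)}{=}& (FV(\bar f) \setminus \{x_1,\ldots,x_n\}) \cup \{k_i \in V \mid x_i \in FV(\bar f)\} \\
&\stackrel{(2.23.1)}{=}& (FV(f) \setminus \{x_1,\ldots,x_n\}) \cup \{k_i \in V \mid x_i \in FV(f)\}.
\end{array}$$

At (1) we use that $\bar f[x_1:=\tilde k_1]\cdots[x_n:=\tilde k_n]$ is a λI-term that β-reduces to $\bar h$; at (2) we use the fact that $FV(\tilde k_i) = \emptyset$ whenever $k_i \in A \cup P$ (by definition of $\tilde k_i$). □


21 The meta-properties can also be proved directly, without λ-calculus: see [79].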
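
The right-hand side of Theorem 2.57 is a plain set computation. As a hedged illustration, the Python sketch below evaluates it for given data; the function name and the representation of variables, individuals and pfs as strings are assumptions made for this example, and the set `variables` plays the role of $V$.

```python
def fv_after_substitution(fv_f, xs, ks, variables):
    """Free variables of f[x1,...,xn := k1,...,kn] per Theorem 2.57.

    fv_f      -- the set FV(f)
    xs, ks    -- the replaced variables and what replaces them
    variables -- the set V; a k_i outside V (an individual or a pf)
                 contributes no free variable
    """
    kept = set(fv_f) - set(xs)
    introduced = {k for x, k in zip(xs, ks) if x in fv_f and k in variables}
    return kept | introduced


if __name__ == "__main__":
    V = {"x", "y", "z", "w"}
    # Substituting the individual a for x removes x and adds nothing.
    print(fv_after_substitution({"x", "y"}, ["x"], ["a"], V))
    # Substituting the variable z for x replaces x by z.
    print(fv_after_substitution({"x", "y"}, ["x"], ["z"], V))
```

A substitution for a variable that is not free in $f$ leaves the set unchanged, which is the case $x_i \notin FV(f)$ of the theorem.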


Theorem 2.58 (Second Free Variable Theorem) Assume that we can derive $\Gamma \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$, and $x_1 < \cdots < x_m$ are the free variables of $f$. Then $m = n$ and $x_i : t_i^{a_i} \in \Gamma$ for all $i \leq n$.


PROOF: An easy induction on $\Gamma \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$. For rules 6 and 7, use Theorem 2.57. □


We can now prove a corollary that we promised in Remark 2.47.7: 


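Corollary 2.59 If $\Gamma \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$ and $\varphi$ is a bijection $\{1,\ldots,n\} \to \{1,\ldots,n\}$ then there is a context $\Gamma'$ and a pf $f'$ which is $\alpha_\Gamma$-equal to $f$ such that
$$\Gamma' \vdash f' : (t_{\varphi(1)}^{a_{\varphi(1)}},\ldots,t_{\varphi(n)}^{a_{\varphi(n)}})^a.$$

PROOF: By the second Free Variable Theorem, we can assume that $f$ has $n$ free variables $x_1 < \cdots < x_n$, and that $x_i{:}t_i^{a_i} \in \Gamma$ for all $i \in \{1,\ldots,n\}$. Take $n$ new free variables $z_1 < \cdots < z_n$ such that $z_i > y$ for all $y \in \mathrm{dom}(\Gamma)$. Now apply rule 7 of RTT $n$ times. □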


We can also prove unicity of types and unicity of orders. Orders are unique 
in the following sense: 
Lemma 2.60 Assume $\Gamma \vdash f : t^a$. If $x$ occurs in $f$ and $x : u^b \in \Gamma$, then $u^b$ is predicative. Moreover, if also $\Gamma \vdash f : t^{a'}$, then $a = a'$.


PROOF: By induction on the derivation of $\Gamma \vdash f : t^a$ one shows that a variable $x$ that occurs in $f$ always has a predicative type in $\Gamma$, and that both $a$ and $a'$ equal one plus the maximum of the orders of all the (free and non-free) variables that occur in $f$. □


Corollary 2.61 (Unicity of types for pfs) Assume $\Gamma$ is a context, $f$ is a pf, $\Gamma \vdash f : t^a$ and $\Gamma \vdash f : u^b$. Then $t^a = u^b$.


PROOF: $t = u$ follows from Theorem 2.58; $a = b$ from Lemma 2.60. □


Remark 2.62 We cannot omit the context $\Gamma$ in Corollary 2.61. For example, the pf $z(x)$ can have different types in different contexts, as is illustrated by the following derivations (we have omitted the orders as they can be calculated via Lemma 2.60):

$$\frac{\dfrac{\vdash R(a_1) : () \qquad \vdash a_1 : 0}{x{:}0 \vdash R(x) : (0)}\;3}{x{:}0,\, z{:}(0) \vdash z(x) : (0,(0))}\;4$$

versus

$$\frac{\dfrac{\vdash R(a_1) : ()}{x{:}() \vdash x() : (())}\;4}{x{:}(),\, z{:}(()) \vdash z(x) : ((),(()))}\;4$$


Theorem 2.58 and Corollary 2.61 show that our system RTT makes sense, 
in a certain way: The type of a pf only depends on the context and does 
not depend on the way in which we derived the type of that pf. 

As a corollary of 2.61 we find: 


Corollary 2.63 If $\Gamma \vdash f : t^a$, $\Gamma \vdash k : u^b$, $x{:}u^b \in \Gamma$ and $\Gamma \vdash f[x:=k] : t'^{a'}$ then $a \geq a'$.


PROOF: If $x \notin FV(f)$ then $f = f[x:=k]$ and the corollary follows from Unicity of Types 2.61. If $x \in FV(f)$ then the variables that occur in $f[x:=k]$ occur either in $f$ or in $k$, and as the order of $k$ is smaller than the order of $f$ ($x \in FV(f)$, so $b < a$), the corollary follows from the proof of Lemma 2.60. □
2c2 Strong normalisation


We investigate the problem whether there exists (in the situation of Definition 2.45.6) a pf $h$ such that $h = f[y:=k]$. We show that this is the case, in Corollary 2.73.

The nonexistence of $f[x_1,\ldots,x_n:=k_1,\ldots,k_n]$ can have two reasons:


• The λ-term $(\lambda_{i=1}^{n} x_i.\bar f)\tilde k_1 \cdots \tilde k_n$ has no β-normal form;


• The λ-term $(\lambda_{i=1}^{n} x_i.\bar f)\tilde k_1 \cdots \tilde k_n$ has β-normal form $H$, but there is no $h \in P$ such that $\bar h = H$.

However, these two things do not occur if we use substitution under the restrictions of Definition 2.45.6. For a proof of this we use the simply typed λ-calculus of Church [30]. This is not only of help for the existence-proof of $h$, but also shows that RTT can be seen as a subsystem of λ→. However, we remark that the orders of RTT are lost in the embedding to λ→. A definition of Church’s calculus is given in Section Ab of the Appendix. We translated the propositional functions and the ramified types of RTT to terms and types of λ→ in Definitions 2.7, 2.21, and 2.40. We now extend the mapping $T$ of 2.40 to contexts:


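Definition 2.64 We define a standard context $\Gamma_0$ in λ→, in which type information on $\lor$, $\lnot$, $\forall$ and elements of $A$ and $\mathcal{R}$ is stored:

$$\begin{array}{rcl}
\Gamma_0 &\stackrel{\text{def}}{=}& \{\lnot : o \to o,\ \lor : o \to o \to o\}\ \cup \\
&& \{a : \iota \mid a \in A\}\ \cup \\
&& \{\forall_{t^a} : T(t^a) \to o \mid t^a \text{ is a ramified type}\}\ \cup \\
&& \{R : \underbrace{\iota \to \cdots \to \iota}_{m \text{ times } \iota} \to o \mid R \in \mathcal{R},\ a(R) = m\}.
\end{array}$$

If $\Gamma$ is a context in RTT then $T(\Gamma) \stackrel{\text{def}}{=} \Gamma_0 \cup \{x : T(t^a) \mid x{:}t^a \in \Gamma\}$.


In particular, $T(\emptyset) = \Gamma_0$.

Theorem 2.65 If $\Gamma \vdash f : t^a$ then
1. $T(\Gamma) \vdash \bar f : o$;
2. $T(\emptyset) \vdash \tilde f : T(t^a)$.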
PROOF: A straightforward induction on $\Gamma \vdash f : t^a$ with the use of Theorem 2.58 and the Subject Reduction property for λ-Church (A.30). □


Remark 2.66 Observe that the above theorem immediately excludes the pf that leads to the Russell Paradox from the well-typed pfs: If $\lnot z(z)$ were legal then the λ-term $\lnot(zz)$ would be typable in λ-Church, which is not the case (see [5]).


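Using the strong normalisation of λ-Church (A.36), it is easy to solve the first problem:


Theorem 2.67 Take $i \leq n$. Assume $\Gamma \cup \{y{:}t_i^{a_i}\} \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$ and $\Gamma \vdash k : t_i^{a_i}$ (so: the preconditions of rule 6 of RTT are fulfilled). Then $(\lambda y{:}T(t_i^{a_i}).\bar f)\tilde k$ is strongly normalising.


PROOF: The theorem is easy for $k \in A$, so assume $k \in P$. By Theorem 2.65.2 we know that $T(\emptyset) \vdash \tilde k : T(t_i^{a_i})$, hence $T(\Gamma) \vdash \tilde k : T(t_i^{a_i})$ by weakening. As, by Theorem 2.65.1, $T(\Gamma) \cup \{y{:}T(t_i^{a_i})\} \vdash \bar f : o$ and therefore $T(\Gamma) \vdash \lambda y{:}T(t_i^{a_i}).\bar f : T(t_i^{a_i}) \to o$, we have $T(\Gamma) \vdash (\lambda y{:}T(t_i^{a_i}).\bar f)\tilde k : o$; so the term $(\lambda y{:}T(t_i^{a_i}).\bar f)\tilde k$, being a typable term in λ→, is strongly normalising. □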


The second problem is harder to tackle: substitution (Definition 2.24) is defined in terms of λ-calculus, and not every λ-term $H$ has an equivalent $h$ in $P$ with $\bar h = H$. This makes it hard to see what happens, especially in case of a substitution


Arena: 


For this substitution we must calculate the β-normal form of


(ALITI Tee Ent Taha Am)kıı kr Km. 


This term reduces to $\tilde k H_1 \cdots H_m$ for some β-normal $H_i$'s. If $k \in P$ then this new term may not be in β-normal form, and it is not clear what will be the final result (cf. Examples 2.25.4 and 2.25.5).

The problem clearly has to do with the special structure of λ-terms $H$ for which there is a (legal) $h \in P$ with $\bar h = H$. Such terms $\bar h$ have one important property: All variables are either arguments of functions, or they are applied to the maximal number of arguments that is possible according to the type of that variable. For instance: if a term $\bar h$ is of the form $zH_1 \cdots H_m$, then the type of $z$ will be of the form $\tau_1 \to \cdots \to \tau_m \to o$.

We call such terms fully applied and give the following formal definition:


Definition 2.68 (Fully applied λ-terms) Let $\Gamma$ be a λ→-context, and let $M$ be a $\Gamma$-legal term of type $t$. Write $M = M_0M_1 \cdots M_m$, where $M_0$ is either a variable or a term of the form $\lambda x{:}\tau.M_0'$. We define the notion $M$ is $\Gamma$-fully applied by induction on the length of $M$:


• If $M_0$ is a variable then $M$ is $\Gamma$-fully applied if $M$ has type $o$ in $\Gamma$, and for $1 \leq i \leq m$, either $M_i$ is $\Gamma$-fully applied, or $M_i$ is a variable;


• If $M_0 = \lambda x{:}\tau.M_0'$ then $M$ is $\Gamma$-fully applied if $M_0'$ is $(\Gamma,x{:}\tau)$-fully applied, and for $1 \leq i \leq m$, either $M_i$ is $\Gamma$-fully applied, or $M_i$ is a variable.


If it is clear which context $\Gamma$ is used, we just write fully applied instead of $\Gamma$-fully applied.
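
Definition 2.68 is a recursive check on the head and arguments of a term and can be sketched as a small program. The Python code below is an illustration under assumptions made here: terms are tagged tuples `("var", x)`, `("lam", x, type, body)`, `("app", fun, arg)`, types are `"ι"`, `"o"` or `("->", a, b)`, and the names `type_of`, `spine`, `fully_applied` are hypothetical. As in the definition, the term is assumed to be legal in the given context.

```python
def type_of(ctx, term):
    """Type of a simply typed term in context ctx (a dict)."""
    tag = term[0]
    if tag == "var":
        return ctx[term[1]]
    if tag == "lam":
        _, x, ty, body = term
        return ("->", ty, type_of({**ctx, x: ty}, body))
    _, fun, arg = term
    fun_ty = type_of(ctx, fun)
    if not (isinstance(fun_ty, tuple) and fun_ty[1] == type_of(ctx, arg)):
        raise TypeError("ill-typed application")
    return fun_ty[2]


def spine(term):
    """Split M0 M1 ... Mm into the head M0 and the list [M1, ..., Mm]."""
    args = []
    while term[0] == "app":
        args.append(term[2])
        term = term[1]
    return term, args[::-1]


def fully_applied(ctx, term):
    """Definition 2.68, for a term that is legal in ctx."""
    head, args = spine(term)
    # Every argument is a variable or is itself fully applied.
    if not all(a[0] == "var" or fully_applied(ctx, a) for a in args):
        return False
    if head[0] == "var":
        # A variable head must be applied until the result has type o.
        return type_of(ctx, term) == "o"
    _, x, ty, body = head
    return fully_applied({**ctx, x: ty}, body)


if __name__ == "__main__":
    ctx = {"x": "ι", "z": ("->", "ι", "o")}
    print(fully_applied(ctx, ("app", ("var", "z"), ("var", "x"))))  # z x
    print(fully_applied(ctx, ("var", "z")))                         # bare z
```

In this context the term $zx$ is fully applied, while the bare variable $z$ (of type $\iota \to o$) is not, since it is not applied to the maximal number of arguments its type allows.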


It will be shown that for each legal propositional function $f$, $\bar f$ and $\tilde f$ are fully applied. This can be done by induction on the derivation of $\Gamma \vdash f : t^a$. For the substitution case, we need some additional properties of fully applied terms.


Lemma 2.69 If $M$ is $(\Gamma,y{:}\tau,\Delta)$-fully applied, and $N$ is $(\Gamma,\Delta)$-fully applied, then $M[y:=N]$ is $(\Gamma,\Delta)$-fully applied.


PROOF: Induction on the structure of $M$.


• $M = xM_1 \cdots M_m$.
  – $x = y$. Then $M[y:=N] = NM_1[y:=N] \cdots M_m[y:=N]$. Distinguish:
    * $N = zN_1 \cdots N_n$. Notice that $\Gamma,\Delta \vdash N : o$. This means that $\Gamma,y{:}\tau,\Delta \vdash y : o$, and therefore $m = 0$ and $M[y:=N] = N$, thus: $M[y:=N]$ is fully applied;
    * $N = (\lambda x{:}\sigma.N')N_1 \cdots N_n$. As $N$ is fully applied, $N'$ is fully applied, and the $N_i$'s are either variables or they are fully applied. By induction, $M_i[y:=N]$ is either a variable or fully applied for $1 \leq i \leq m$. This means that

      $$(\lambda x{:}\sigma.N')N_1 \cdots N_n M_1[y:=N] \cdots M_m[y:=N]$$

      is fully applied;

  – $x \neq y$. By induction, $M_i[y:=N]$ is either a variable or fully applied for $1 \leq i \leq m$. By the Substitution Lemma (A.26), $xM_1[y:=N] \cdots M_m[y:=N]$ has type $o$ in $(\Gamma,\Delta)$. This means that $xM_1[y:=N] \cdots M_m[y:=N]$ is fully applied;

• $M = (\lambda x{:}\sigma.M')M_1 \cdots M_m$. By induction, $M'[y:=N]$ is $(\Gamma,\Delta,x{:}\sigma)$-fully applied, and $M_i[y:=N]$ is either a variable, or fully applied. Therefore $M[y:=N]$ is fully applied.

□


Lemma 2.70 If $M$ is $\Gamma$-fully applied and $M \to_\beta M'$, then $M'$ is $\Gamma$-fully applied.


PROOF: Induction on the structure of $M$.


• $M = xM_1 \cdots M_m$. The reduction must occur in a term $M_i$, say: $M_i \to_\beta M_i'$. As $M_i$ cannot be a variable, $M_i$ is fully applied. By the induction hypothesis, $M_i'$ is fully applied. $M$ has type $o$, so by Subject Reduction A.30, $M'$ has type $o$. Hence $M'$ is fully applied;


• $M = (\lambda x{:}\sigma.M_0)M_1 \cdots M_m$. If the reduction occurs within $M_1,\ldots,M_m$ or $M_0$ then we can give a similar argument as in the first case. Now assume $M' = M_0[x:=M_1]M_2 \cdots M_m$. Observe: $M_0$ is $(\Gamma,x{:}\sigma)$-fully applied. If $M_1$ is a variable, then clearly $M_0[x:=M_1]$ is $\Gamma$-fully applied. If $M_1$ is not a variable then $M_1$ is $\Gamma$-fully applied, so by Lemma 2.69, $M_0[x:=M_1]$ is $\Gamma$-fully applied. Distinguish:

  – $M_0[x:=M_1] = yN_1 \cdots N_n$. As $M_0[x:=M_1]$ is fully applied, it has type $o$. This means that $m = 1$, and that $M' = M_0[x:=M_1]$ is fully applied;
  – $M_0[x:=M_1] = (\lambda y{:}\tau.N)N_1 \cdots N_n$. As $M_0[x:=M_1]$ is fully applied, $N$ is fully applied, and the $N_i$'s are either variables or fully applied terms. Therefore $M' = (\lambda y{:}\tau.N)N_1 \cdots N_n M_2 \cdots M_m$ is fully applied.


Now we can prove: 


Lemma 2.71 Let $f \in P$. If $\Gamma \vdash f : t^a$ then $\bar f$ is $T(\Gamma)$-fully applied, and $\tilde f$ is $T(\emptyset)$-fully applied.


PROOF: We prove that $\bar f$ is $T(\Gamma)$-fully applied. Then it easily follows that $\tilde f$ is $T(\emptyset)$-fully applied. We use induction on the derivation of $\Gamma \vdash f : t^a$. All cases are easy to check, except for the abstraction-from-parameters and the substitution cases:


3. (abstraction from parameters) We use notations as in Definition 2.45.3. By induction on the structure of $f$ it is shown that if $\bar f$ and $\bar g$ are fully applied, then $\bar h$ is fully applied;

6. (substitution) We use notations as in Definition 2.45.6. Let $h = f[y:=k]$. If $k \in A$ then $\bar h$ is just $\bar f$ in which all free occurrences of $y$ have been replaced by $k$. It is then easy to see that $\bar h$ is fully applied. Now suppose $k \in P$. By the induction hypothesis, $\bar f$ and $\tilde k$ are fully applied. This means that $(\lambda y{:}T(t_i^{a_i}).\bar f)\tilde k$ is fully applied. This term β-reduces to $\bar h$. By Lemma 2.70, $\bar h$ is fully applied.

□


We now know that each legal pf $f$ gives rise to fully applied $\lambda$-terms $\bar f$
and $\tilde f$. This is of great help in showing that substitutions always exist in
case of the application of rule 6 of Definition 2.45. We first show that each
substitution in RTT that gives rise to a $\beta$-reduction path starting with a
fully applied $\lambda$-term, really exists. Then it is easy to show that substitutions
always exist in the situation of Definition 2.45.6.


Lemma 2.72 Let $f \in \mathcal{P}$, let $k_1,\ldots,k_n \in \mathcal{A}\cup\mathcal{V}\cup\mathcal{P}$, and assume that
$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n$ is $T(\Gamma)$-fully applied, where $\Gamma$ is a RTT-context. Then
there is $h \in \mathcal{P}$ such that $h = f[x_1,\ldots,x_n := k_1,\ldots,k_n]$.

PROOF: Notice that $(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n$ is a legal term of $\lambda{\to}$, and
therefore strongly normalising. Let $q$ be the length of the longest reduction path
of this term. Use induction on $q$, so assume that the lemma has been proved
for all $q' < q$ (write IH1 for the induction hypothesis). We use induction
on the structure of $f$ (and write IH2 for this induction hypothesis). Some
cases can be handled directly; for other cases we need the help of Lemma 2.70.


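1. $f = R(i_1,\ldots,i_{\mathfrak{a}(R)})$. Define

$$i_j' = \begin{cases} k_\ell & \text{if } i_j = x_\ell; \\ i_j & \text{if } i_j \notin \{x_1,\ldots,x_n\}. \end{cases}$$

Observe that $i_j' \in \mathcal{A}\cup\mathcal{V}$: otherwise there would have been $k_\ell \in \mathcal{P}$
such that $i_j = x_\ell$. But as $R$ has type $\iota \to \cdots \to \iota \to o$, $x_\ell$ must have
type $\iota$, and therefore $\tilde k_\ell$ has type $\iota$, which means that $k_\ell$ cannot be a
pf (Theorem 2.58, Theorem 2.65).

Let $f' = R(i_1',\ldots,i_{\mathfrak{a}(R)}')$. As none of the $i_j'$s is a pf, we have $f' \in \mathcal{P}$.
Observe:

$$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n \twoheadrightarrow_\beta \bar{f'},$$

so $f[x_1,\ldots,x_n := k_1,\ldots,k_n]$ exists and is equal to $f'$;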


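2. $f = f_1 \vee f_2$. Notice: $\bar f$ is fully applied and $\bar f = \vee \bar f_1 \bar f_2$, so $\bar f_1$ and $\bar f_2$
are fully applied. Therefore, $(\lambda_{i=1}^{n} x_i{:}t_i.\bar f_j)\tilde k_1 \cdots \tilde k_n$ is fully applied for
$j = 1,2$. By the induction hypothesis$^{22}$, there are $h_1, h_2 \in \mathcal{P}$ such
that

$$f_j[x_1,\ldots,x_n := k_1,\ldots,k_n] = h_j$$

for $j = 1,2$. This means that

$$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n \twoheadrightarrow_\beta \vee \bar h_1 \bar h_2$$

and therefore $f[x_1,\ldots,x_n := k_1,\ldots,k_n] = h_1 \vee h_2$.

A similar proof can be given for $f = \neg f'$;

$^{22}$ Observe that the longest reduction path of $(\lambda_{i=1}^{n} x_i{:}t_i.\bar f_j)\tilde k_1 \cdots \tilde k_n$ has a length
$\le q$. If the length is equal to $q$ then use IH2; otherwise use IH1.

3. $f = \forall x{:}t^a[f']$. Notice that $\bar f = \forall_{T(t^a)}(\lambda x{:}T(t^a).\bar{f'})$. As $\bar f$ is fully applied,
$\bar{f'}$ is fully applied as well. This means that $(\lambda_{i=1}^{n} x_i{:}t_i.\bar{f'})\tilde k_1 \cdots \tilde k_n$ is
fully applied. By the induction hypothesis$^{22}$, there is $h \in \mathcal{P}$ such that
$f'[x_1,\ldots,x_n := k_1,\ldots,k_n] = h$. This means that

$$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n \twoheadrightarrow_\beta \forall_{T(t^a)}(\lambda x{:}T(t^a).\bar h)$$

and therefore $f[x_1,\ldots,x_n := k_1,\ldots,k_n] = \forall x{:}t^a[h]$. As $\bar f, \tilde k_1, \ldots, \tilde k_n$ are
all $\lambda I$-terms, $x \in \mathrm{FV}(\bar h)$, so $x \in \mathrm{FV}(h)$, which means that $\forall x{:}t^a[h]$ is
indeed a pf;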


4. $f = z(h_1,\ldots,h_m)$. If $z \notin \{x_1,\ldots,x_n\}$ then we can give a proof similar
to the case $f = R(i_1,\ldots,i_{\mathfrak{a}(R)})$. Now assume $z = x_p$. Define

$$h_j' = \begin{cases} k_\ell & \text{if } h_j = x_\ell; \\ h_j & \text{if } h_j \notin \{x_1,\ldots,x_n\}. \end{cases}$$

Notice that $\bar f$ is fully applied. As $\bar f$ starts with a variable (this is due
to the definition of $\bar f$), it has type $o$. Therefore,

$$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n$$

has type $o$ as well. Observe:

$$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n \twoheadrightarrow_\beta \tilde k_p \tilde h_1' \cdots \tilde h_m'.$$

Write $K = \tilde k_p \tilde h_1' \cdots \tilde h_m'$. $K$ is fully applied (Lemma 2.70), and has type
$o$ (Subject Reduction A.30). Observe that $\tilde k_p = \lambda_{j=1}^{q} y_j{:}u_j.\bar k_p$. Hence
$\bar k_p$ is fully applied. By definition of $\bar k_p$, $\bar k_p$ starts with a variable.
Therefore, $\bar k_p$ has type $o$. As $K$ has type $o$ as well, we have that
$q = m$, and that $K$ represents the substitution

$$k_p[y_1,\ldots,y_m := h_1',\ldots,h_m'].$$

The longest reduction path of $K$ is shorter than the longest reduction
path of

$$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n,$$

so we can apply the induction hypothesis IH1 and conclude that there
is $h \in \mathcal{P}$ such that $k_p[y_1,\ldots,y_m := h_1',\ldots,h_m'] = h$. But then also
$f[x_1,\ldots,x_n := k_1,\ldots,k_n] = h$.


With this lemma it is easy to show that substitution always exists in
the case of RTT-rule 2.45.6.


Theorem 2.73 (Existence of substitution) If $f \in \mathcal{P}$, $y$ is the $i$th free
variable in $f$, $\Gamma \cup \{y{:}t_i^{a_i}\} \vdash f : (t_1^{a_1},\ldots,t_n^{a_n})^a$, and $\Gamma \vdash k : t_i^{a_i}$, then $f[y:=k]$
exists.


PROOF: Notice that $\bar f$ and $\tilde k$ are fully applied. Therefore $(\lambda y{:}T(t_i^{a_i}).\bar f)\tilde k$ is
fully applied. By Lemma 2.72, $f[y:=k]$ exists. □
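
The idea behind Lemma 2.72 and Theorem 2.73, that a substitution is obtained by applying an abstraction to the argument and β-reducing to normal form, can be sketched in code. The following is only an illustrative, untyped approximation: the tuple encoding of terms and the helper names (`var`, `lam`, `app`, `fresh`, `subst`, `normalize`) are assumptions made for this example and are not the translation defined in the thesis. Termination is assumed because the terms of interest are typable in λ→ and hence strongly normalising.

```python
# Minimal untyped lambda-term sketch (hypothetical encoding, not the thesis' definition).
# Terms: ("var", name) | ("lam", name, body) | ("app", function, argument)

def var(name):
    return ("var", name)

def lam(name, body):
    return ("lam", name, body)

def app(fun, *args):
    # Left-nested application: app(f, a, b) stands for (f a) b.
    for arg in args:
        fun = ("app", fun, arg)
    return fun

def free_vars(term):
    tag = term[0]
    if tag == "var":
        return {term[1]}
    if tag == "lam":
        return free_vars(term[2]) - {term[1]}
    return free_vars(term[1]) | free_vars(term[2])

def fresh(base, avoid):
    # Pick a name of the form base0, base1, ... that is not in `avoid`.
    i = 0
    while f"{base}{i}" in avoid:
        i += 1
    return f"{base}{i}"

def subst(term, x, s):
    """Capture-avoiding substitution term[x := s]."""
    tag = term[0]
    if tag == "var":
        return s if term[1] == x else term
    if tag == "app":
        return ("app", subst(term[1], x, s), subst(term[2], x, s))
    y, body = term[1], term[2]
    if y == x:
        return term                      # x is shadowed, nothing to replace
    if y in free_vars(s):
        # Rename the bound variable so that free variables of s are not captured.
        z = fresh(y, free_vars(s) | free_vars(body) | {x})
        body = subst(body, y, var(z))
        y = z
    return ("lam", y, subst(body, x, s))

def normalize(term):
    """Beta-normal form; assumes the term is strongly normalising."""
    tag = term[0]
    if tag == "var":
        return term
    if tag == "lam":
        return ("lam", term[1], normalize(term[2]))
    fun = normalize(term[1])
    if fun[0] == "lam":
        return normalize(subst(fun[2], fun[1], term[2]))
    return ("app", fun, normalize(term[2]))

# Substituting the pf R(y) for z in z(a), read as application plus beta-reduction:
# (lambda z. z a) (lambda y. R y) reduces to R a.
f_abstracted = lam("z", app(var("z"), var("a")))
k_closed = lam("y", app(var("R"), var("y")))
result = normalize(app(f_abstracted, k_closed))
```

Here `result` is the encoding of `R a`, matching the intended reading of z(a)[z := R(y)] as R(a). The normaliser does not check types, so it would loop on terms that are not strongly normalising; in the setting of Lemma 2.72 that situation is excluded by typability.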


2c3 Subterm property 


The technique of fully applied λ-terms that was used in Section 2c2 to prove
the existence of substitution can also be used to prove another important 
property of type systems for RTT: the Subterm Property. This property 
states that if a propositional function is typable, then its recursive param- 
eters (see Definition 2.17) are typable as well. If all recursive parameters 
of a legal pf f are typable, we say that f has the subterm property: 


Definition 2.74 Assume $\Gamma \vdash f : t^a$. If for all $h \in \mathrm{RP}(f) \cap \mathcal{P}$ there is $\Delta \supseteq \Gamma$
and a predicative type $u^b$ such that $\Delta \vdash h : u^b$, then $f$ has the subterm
property. Notation: $\mathrm{SP}(f)$.


Just as in Section 2c2, we prove by induction on the derivation of $\Gamma \vdash f : t^a$
that all legal pfs have the subterm property. Again, all cases are easy,
except for the substitution rule 2.45.6. This case can be solved using
techniques similar to those of Section 2c2.


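Lemma 2.75 Let $f \in \mathcal{P}$, let $k_1,\ldots,k_n \in \mathcal{A}\cup\mathcal{V}\cup\mathcal{P}$, and assume that
$(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n$ is fully applied with respect to $T(\Gamma)$, where $\Gamma$ is a
RTT-context. If $\mathrm{SP}(f)$ and $\mathrm{SP}(k_i)$ for all $k_i \in \mathcal{P}$, then

1. $\mathrm{RP}(f[\vec x := \vec k]) \subseteq \mathrm{RP}(f) \cup \{k_i \mid 1 \le i \le n\} \cup \bigcup_{k_i \in \mathcal{P}} \mathrm{RP}(k_i)$;


PROOF: Clearly, (2) follows from (1). We prove (1) by induction on the
length of the reduction path of $(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n$. We use induction on
the structure of $f$, and only treat the interesting case: $f = z(h_1,\ldots,h_m)$,
and $z = x_p$. As in the proof of Lemma 2.72, we define

$$h_j' = \begin{cases} k_\ell & \text{if } h_j = x_\ell; \\ h_j & \text{if } h_j \notin \{x_1,\ldots,x_n\} \end{cases}$$

and prove that $f[x_1,\ldots,x_n := k_1,\ldots,k_n] = k_p[y_1,\ldots,y_m := h_1',\ldots,h_m']$. As
the reduction path of $(\lambda_{j=1}^{m} y_j{:}u_j.\bar k_p)\tilde h_1' \cdots \tilde h_m'$ is shorter than the reduction
path of $(\lambda_{i=1}^{n} x_i{:}t_i.\bar f)\tilde k_1 \cdots \tilde k_n$, we can use the induction hypothesis:$^{23}$

$$\begin{aligned}
\mathrm{RP}(f[\vec x := \vec k]) &= \mathrm{RP}(k_p[\vec y := \vec{h'}]) \\
&\subseteq \mathrm{RP}(k_p) \cup \{h_j' \mid 1 \le j \le m\} \cup \bigcup_{h_j' \in \mathcal{P}} \mathrm{RP}(h_j') \\
&\subseteq \mathrm{RP}(f) \cup \{k_i \mid 1 \le i \le n\} \cup \bigcup_{k_i \in \mathcal{P}} \mathrm{RP}(k_i).
\end{aligned}$$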


As a corollary we get:

Corollary 2.76 (Subterm Lemma) If $\Gamma \vdash f : t^a$ then $\mathrm{SP}(f)$.


PROOF: Induction on $\Gamma \vdash f : t^a$. All cases are easily checked except for the
substitution rule, which is proved with Lemma 2.75. □


$^{23}$ Note that $\mathrm{RP}(k_p) \subseteq \bigcup_{k_i \in \mathcal{P}} \mathrm{RP}(k_i)$, $\{h_j' \mid 1 \le j \le m\} \subseteq \mathrm{RP}(f) \cup \{k_i \mid 1 \le i \le n\}$ and
$\bigcup_{h_j' \in \mathcal{P}} \mathrm{RP}(h_j') \subseteq \mathrm{RP}(f) \cup \bigcup_{k_i \in \mathcal{P}} \mathrm{RP}(k_i)$.


2d Legal propositional functions


We recall Definition 2.46: a pf $f$ is called legal if $\Gamma \vdash f : t^a$ for some $\Gamma$
and $t^a$. We will check whether this definition of legal pf coincides with the
definition of formula that was given in the Principia. For this purpose we
prove a number of lemmas concerning the relation between legal pfs and
predicative types.

We do not distinguish between pfs that are ap-equal, nor between types
$(t_1,\ldots,t_n)$ and $(t_{\varphi(1)},\ldots,t_{\varphi(n)})$ for a bijection $\varphi$. This is justified by
Corollary 2.59 and by the fact that pfs that are ap-equal are supposed to be
the same in the Principia too.

We define the notion “up to ap-equality” formally:


Definition 2.77 Let $f \in \mathcal{P}$, $\Gamma$ a context, $t^a$ a type. $f$ is of type $t^a$ in
the context $\Gamma$ up to ap-equality, notation $\Gamma \vdash f : t^a$ (mod ap), if there is
$f' \in \mathcal{P}$, a context $\Gamma'$ and a bijection $\varphi : \mathcal{V} \to \mathcal{V}$ such that

• $\Gamma' \vdash f' : t^a$;
• $f'$ and $f$ are ap-equal via the bijection $\varphi$;
• $\Gamma' = \{\varphi(x){:}u^b \mid x{:}u^b \in \Gamma\}$.


We say that $f$ is legal in the context $\Gamma$ up to ap-equality if there is a type
$u^b$ such that $\Gamma \vdash f : u^b$ (mod ap). We say that $f$ is legal up to ap-equality
if there is a context $\Gamma$ such that $f$ is legal in $\Gamma$ up to ap-equality.


The following lemma states that all predicative types are “inhabited”: 


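Lemma 2.78 If $t^a$ is predicative then there are $f$, $\Gamma$ such that $\Gamma \vdash f : t^a$.


PROOF: We use induction on predicative types.
The case $t^a = 0^0$ is trivial.

Now assume $t^a = (t_1^{a_1},\ldots,t_m^{a_m})^a$. By induction there are $f_i$ and $\Gamma_i$ such that
$\Gamma_i \vdash f_i : t_i^{a_i}$ for all $i \le m$. Take a fixed $i$. We shall find a context $\Delta_i$ and a
legal pf $g_i$ such that $\Delta_i \vdash g_i : (t_i^{a_i})^{a_i+1}$. Distinguish two cases:

• $t_i^{a_i} = 0^0$. Then make the following derivation:

$$\frac{\Gamma_i \vdash R(f_i) : ()^0 \qquad \Gamma_i \vdash f_i : 0^0}{\Gamma_i, z_i{:}0^0 \vdash R(z_i) : (0^0)^1}$$

Write $\Delta_i = \Gamma_i \cup \{z_i{:}0^0\}$, and $g_i = R(z_i)$, then $\Delta_i \vdash g_i : (t_i^{a_i})^{a_i+1}$;

• $t_i^{a_i} \ne 0^0$, say $t_i^{a_i} = (u_1^{b_1},\ldots,u_n^{b_n})^{a_i}$. Because of Theorem 2.58, $f_i$ has $n$
free variables, say $x_1 < \cdots < x_n$, such that $x_j{:}u_j^{b_j} \in \Gamma_i$. Now use rule
4 of RTT:

$$\frac{\Gamma_i \vdash f_i : t_i^{a_i}}{\Gamma_i' \vdash z_i(x_1,\ldots,x_n) : (u_1^{b_1},\ldots,u_n^{b_n},t_i^{a_i})^{a_i+1}} \qquad (1)$$

where $\Gamma_i' = \{x_j{:}u_j^{b_j} \mid 1 \le j \le n\} \cup \{z_i{:}t_i^{a_i}\}$. Use rule 8 $n$ times:

$$\{z_i{:}t_i^{a_i}\} \vdash \forall x_1{:}u_1^{b_1}[\cdots \forall x_n{:}u_n^{b_n}[z_i(x_1,\ldots,x_n)] \cdots] : (t_i^{a_i})^{a_i+1}.$$

Write $\Delta_i = \{z_i{:}t_i^{a_i}\}$ and

$$g_i = \forall x_1{:}u_1^{b_1}[\cdots \forall x_n{:}u_n^{b_n}[z_i(x_1,\ldots,x_n)] \cdots].$$

For arbitrary $i = 1,\ldots,m$ we now have: $\Delta_i \vdash g_i : (t_i^{a_i})^{a_i+1}$.

We can assume that $x < y$ for $x \in \mathrm{dom}(\Delta_i)$, $y \in \mathrm{dom}(\Delta_j)$ with $i < j$.
Write $\Delta = \Delta_1 \cup \ldots \cup \Delta_m$. Now apply rule 2 consecutively $m-1$ times, to
obtain:

$$\Delta \vdash g_1 \vee (g_2 \vee (\cdots \vee (g_{m-1} \vee g_m) \cdots)) : (t_1^{a_1},\ldots,t_m^{a_m})^a.$$

Notice that $(t_1^{a_1},\ldots,t_m^{a_m})^a$ is predicative, so $a = \max(a_1,\ldots,a_m) + 1$. □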
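
The order bookkeeping used at the end of this proof, $a = \max(a_1,\ldots,a_m) + 1$, can be made concrete with a small sketch. The class `RType` and the helpers `predicative` and `is_predicative` are hypothetical names introduced only for this illustration. Two assumptions are made and marked in the comments: the maximum over an empty argument list is taken to be −1, so that the proposition type $()^0$ counts as predicative (in line with Example 3.4 below), and a predicative type is required to have predicative argument types.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class RType:
    """A ramified type: either the individual type 0^0 or (t_1,...,t_m)^order."""
    args: Tuple["RType", ...]
    order: int
    is_individual: bool = False

# The type 0^0 of individuals.
IND = RType(args=(), order=0, is_individual=True)

def predicative_order(args: Tuple[RType, ...]) -> int:
    # Assumption: max over no arguments is -1, so that ()^0 is predicative.
    return 1 + max((t.order for t in args), default=-1)

def predicative(*args: RType) -> RType:
    """Build the predicative type with the given argument types."""
    return RType(args=tuple(args), order=predicative_order(tuple(args)))

def is_predicative(t: RType) -> bool:
    if t.is_individual:
        return t.order == 0
    # Assumption: argument types of a predicative type are predicative themselves.
    return (t.order == predicative_order(t.args)
            and all(is_predicative(arg) for arg in t.args))

unary = predicative(IND)                 # (0^0)^1
second = predicative(unary)              # ((0^0)^1)^2
mixed = predicative(IND, second, unary)  # order is max(0, 2, 1) + 1
```

With these definitions `mixed.order` is 3, and a type such as `RType((), 1)`, the type $()^1$ of a proposition that quantifies over individuals, is reported as not predicative.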


Remark 2.79 From a modern point of view, this is a remarkable lemma. 
Many modern type systems are based on the principle of propositions-as- 
types (see Chapter 4). In such systems types represent propositions, and 
terms inhabiting such a type represent proofs of that proposition. In a 
propositions-as-types based system in which all types are inhabited, all 
propositions are provable. Such a system would be (logically) inconsistent. 
RTT is not based on propositions-as-types, and there is nothing paradoxical 
or inconsistent in the fact that all RTT-types are inhabited. 


This lemma can be generalised to some non-predicative types: 


Corollary 2.80 If $(t_1^{a_1},\ldots,t_m^{a_m})^a$ is a type such that the $t_i^{a_i}$ are all
predicative, then there are $f$ and $\Gamma$ such that $\Gamma \vdash f : (t_1^{a_1},\ldots,t_m^{a_m})^a$.

PROOF: With Lemma 2.78 we can construct $g_1$ and $\Delta_1$ such that

$$\Delta_1 \vdash g_1 : (t_1^{a_1},\ldots,t_m^{a_m})^{\max(a_1,\ldots,a_m)+1}.$$

Let $u^a$ be a predicative type of order $a$. Determine, again with Lemma
2.78, $g_2$ and $\Delta_2$ such that $\Delta_2 \vdash g_2 : u^a$. Assume $z_1 < \cdots < z_n$ are the free
variables of $g_2$ and $z_j{:}u_j^{b_j} \in \Delta_2$. Notice that $u^a = (u_1^{b_1},\ldots,u_n^{b_n})^a$ (Theorem
2.58). Apply rule 8 and weakening $n$ times to obtain:

$$\Delta_2 \vdash \forall z_1{:}u_1^{b_1}[\cdots \forall z_n{:}u_n^{b_n}[g_2] \cdots] : ()^a.$$

We can assume that $x < y$ for all $x \in \mathrm{dom}(\Delta_1)$ and all $y \in \mathrm{dom}(\Delta_2)$, so
we can use rule 2 to conclude:

$$\Delta_1 \cup \Delta_2 \vdash g_1 \vee \forall z_1{:}u_1^{b_1}[\cdots \forall z_n{:}u_n^{b_n}[g_2] \cdots] : (t_1^{a_1},\ldots,t_m^{a_m})^a.$$

□

We can use the same techniques as in the preceding proof to show that
$z(k_1,\ldots,k_m)$ is legal if $k_1,\ldots,k_m$ are either legal pfs or variables, and $z$ is
“fresh”.


Lemma 2.81 If $k_1,\ldots,k_n \in \mathcal{A}\cup\mathcal{V}\cup\mathcal{P}$, $t^a = (t_1^{a_1},\ldots,t_n^{a_n})^a$ is a predicative
type, $\Gamma \vdash k_i : t_i^{a_i}$ for all $k_i \in \mathcal{A}\cup\mathcal{P}$ and $k_i{:}t_i^{a_i} \in \Gamma$ for all $k_i \in \mathcal{V}$, and
$z \in \mathcal{V} \setminus \mathrm{dom}(\Gamma)$, then $z(k_1,\ldots,k_n)$ is legal in the context $\Gamma \cup \{z{:}t^a\}$ (up to
ap-equality).


PROOF: First, we make a derivation of

$$\{z{:}t^a\} \cup \{x_i{:}t_i^{a_i} \mid 1 \le i \le n\} \vdash z(x_1,\ldots,x_n) : (t_1^{a_1},\ldots,t_n^{a_n},t^a)^{a+1} \pmod{\text{ap}},$$

similarly to the derivation of (1) in the proof of Lemma 2.78. Next, find
(with Lemma 2.78) $k_1',\ldots,k_n'$ such that

• $k_i' = k_i$ if $k_i \in \mathcal{A}\cup\mathcal{P}$;

• $k_i' \in \mathcal{A}\cup\mathcal{P}$ has type $t_i^{a_i}$ in a context $\Delta_i$ if $k_i \in \mathcal{V}$;

• $\Delta$, the union of the contexts $\Gamma$ and the $\Delta_i$s, is a context;

• For $k_i, k_j \in \mathcal{V}$: $k_i'$ and $k_j'$ are ap-equal if and only if $k_i = k_j$.$^{24}$


Apply rule 6 $n$ times (as in the proof of Lemma 2.78), and where necessary
the weakening rule, to obtain:

$$\{z{:}t^a\} \cup \Delta \vdash z(k_1',\ldots,k_n') : (t^a)^{a+1} \pmod{\text{ap}}.$$

Now introduce, with rule 3, new variables for the $k_i'$ which are not equal to
$k_i$, to obtain a legal pf that is ap-equal to $z(k_1,\ldots,k_n)$. □


It is also not hard to show that $f \vee g$ is legal if $f$ and $g$ are (see also Remark
2.49):


Lemma 2.82 If $f$ and $g$ are legal in contexts $\Gamma_1$ and $\Gamma_2$, respectively, and
$\Gamma_1 \cup \Gamma_2$ is a context, then $f \vee g$ is legal in the context $\Gamma_1 \cup \Gamma_2$ (up to
ap-equality).


PROOF: For reasons of clarity, we again leave out the orders of the ramified
types. We cannot simply apply rule 2 of RTT, as the contexts $\Gamma_1$ and
$\Gamma_2$ may not satisfy the condition imposed on them in rule 2. Assume $\Gamma_1 \vdash f :
(t_1,\ldots,t_m)$, $\Gamma_2 \vdash g : (u_1,\ldots,u_n)$, and $x_1 < \cdots < x_m$ and $y_1 < \cdots < y_n$
are the free variables of $f$ and $g$, respectively. Write $t = (t_1,\ldots,t_m)$ and
$u = (u_1,\ldots,u_n)$. Take variables $x_1' < \cdots < x_m' < z_1 < y_1' < \cdots < y_n' < z_2$ not
occurring in the domain of $\Gamma_1 \cup \Gamma_2$; let

$$\Delta_1 = \{z_1{:}t, x_1'{:}t_1, \ldots, x_m'{:}t_m\}; \qquad \Delta_2 = \{z_2{:}u, y_1'{:}u_1, \ldots, y_n'{:}u_n\}.$$

Similar to the derivation (1) in the proof of Lemma 2.78, we can derive

$$\Delta_1 \vdash z_1(x_1',\ldots,x_m') : (t_1,\ldots,t_m,t);$$

$$\Delta_2 \vdash z_2(y_1',\ldots,y_n') : (u_1,\ldots,u_n,u).$$

As $\Delta_1$ and $\Delta_2$ obey the conditions of rule 2 of RTT, we can derive

$$\Delta_1 \cup \Delta_2 \vdash z_1(x_1',\ldots,x_m') \vee z_2(y_1',\ldots,y_n') : (t_1,\ldots,t_m,t,u_1,\ldots,u_n,u).$$


$^{24}$ One might wonder whether there are enough pfs of one type that are not ap-equal.
Lemma 2.78 provides only one pf for each type. But if we have that pf, say $k$, then we
use rule 2 of RTT to create $\neg k$, $\neg\neg k$, $\neg\neg\neg k$, etc. $k$, $\neg k$, $\neg\neg k$, ... are all ap-different and
of the same type as $k$.

With similar techniques as in the proof of Lemma 2.81 we can now derive

$$\Gamma_1 \cup \Gamma_2 \cup \{z_1{:}t, z_2{:}u\} \vdash z_1(x_1,\ldots,x_m) \vee z_2(y_1,\ldots,y_n) : v \pmod{\text{ap}}$$

for a certain type $v$ (notice that the sets $\{x_1',\ldots,x_m'\}$ and $\{y_1',\ldots,y_n'\}$ do
not overlap, whilst the sets $\{x_1,\ldots,x_m\}$ and $\{y_1,\ldots,y_n\}$ may overlap).
Use rule 6 twice: substitute $f$ for $z_1$ and substitute $g$ for $z_2$. This gives a
derivation of $f \vee g$ in the context $\Gamma_1 \cup \Gamma_2$ (mod ap). □


The following lemma is easy to prove and will be used in the proof of the 
main result of this section. 


Lemma 2.83 If $R(i_1,\ldots,i_{\mathfrak{a}(R)})$ is a pf with free variables $x_1 < \cdots < x_m$,
then it is legal in the context $\{x_j{:}0^0 \mid 1 \le j \le m\}$.


PROOF: Write $f = R(i_1,\ldots,i_{\mathfrak{a}(R)})$. Let $a_1,\ldots,a_m \in \mathcal{A}$ be $m$ different
individuals that do not occur in $f$, and replace each variable $x_j$ in $f$ by
$a_j$, calling the result $f'$. By the first rule of RTT, $f'$ is legal in the empty
context. Re-introducing the variables $x_1,\ldots,x_m$ (by applying rule 3 of RTT
$m$ times) for the individuals $a_1,\ldots,a_m$, respectively, we obtain that $f$ is
legal in the context $\{x_j{:}0^0 \mid 1 \le j \le m\}$. □


Finally, we can give a characterisation of the legal pfs: 
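Theorem 2.84 Let $f \in \mathcal{P}$. $f$ is legal (mod ap) if and only if:

• $f = R(i_1,\ldots,i_{\mathfrak{a}(R)})$, or

• $f = z(k_1,\ldots,k_n)$, $z \ne k_j$ for all $k_j \in \mathcal{V}$ and $z$ does not occur in any
$k_j \in \mathcal{P}$, and there is $\Gamma$ with $\mathrm{FV}(f) \subseteq \mathrm{dom}(\Gamma)$ and for all $k_j \in \mathcal{P}$,
$\Gamma \vdash k_j : t_j^{a_j}$ for some predicative type $t_j^{a_j}$, or

• $f = \neg f'$ and $f'$ is legal (mod ap), or

• $f = f_1 \vee f_2$ and there are $\Gamma_i$ and $t_i^{a_i}$ such that $\Gamma_i \vdash f_i : t_i^{a_i}$ (mod ap)
for $i = 1,2$ and $\Gamma_1 \cup \Gamma_2$ is a context, or

• $f = \forall x{:}t^a[f']$ and $f'$ is legal.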


PROOF: Use induction on the structure of $f$:

• $f = R(i_1,\ldots,i_{\mathfrak{a}(R)})$. This is Lemma 2.83;

• $f = z(k_1,\ldots,k_n)$. “⇐” is Lemma 2.81. “⇒”: $f$ is legal, so there
is $\Gamma$ with $\Gamma \vdash f : t^a$. As $T(\Gamma) \vdash_{\lambda\to} z\tilde k_1 \cdots \tilde k_n : o$ (Theorem 2.65),
$z{:}(u_1^{b_1},\ldots,u_n^{b_n})^b \in \Gamma$ for a predicative type $(u_1^{b_1},\ldots,u_n^{b_n})^b$, and $T(\Gamma) \vdash_{\lambda\to}
\tilde k_j : T(u_j^{b_j})$. If $z = k_j$ then $z = \tilde k_j$ and therefore $z{:}u_j^{b_j} \in \Gamma$, which is
impossible. By Corollary 2.76, each $k_j \in \mathcal{P}$ is typable in $\Gamma$, and as
$T(\Gamma) \vdash_{\lambda\to} \tilde k_j : T(u_j^{b_j})$ and the type of $k_j$ is predicative, $\Gamma \vdash k_j : u_j^{b_j}$.
Notice that $b_j < b$, so it is impossible that $z$ occurs in a $k_j \in \mathcal{P}$;

• “⇐” is Rule 2 of RTT (for $\neg$) and Lemma 2.82 (for $\vee$). “⇒” (for $\vee$;
the proof for $\neg$ is similar): Let $\Delta$ be the context containing all the
variables of $f$ (also those that are bound by a quantifier; we can assume
that different quantifiers bind different variables) and their types. $f$ is
built from several pfs of the form $R(i_1,\ldots,i_{\mathfrak{a}(R)})$ and $z(k_1,\ldots,k_m)$ (we
will call these pfs the constituents of $f$), and the logical connectives
$\neg$, $\vee$ and $\forall$. Reasoning as in the “⇒” part of the first two cases of
the proof of this lemma, we can show the preconditions for Lemma
2.83 (for constituents of the form $R(i_1,\ldots,i_{\mathfrak{a}(R)})$) and Lemma 2.81
(for constituents of the form $z(k_1,\ldots,k_m)$). Applying these Lemmas,
we find that any constituent of $f$ is typable in $\Delta$. Using Rule 2 of
RTT (for $\neg$), Rule 8 of RTT (for $\forall$) and Lemma 2.82 (for $\vee$), we find
that $f_1 \vee f_2$ itself is typable;

• “⇐” is Rule 8 of RTT. “⇒” is similar to “⇒” in the previous case.


We can now answer the question whether our legal pfs (as defined in 2.46)
are the same as the formulas of the Principia.

First of all, we must notice that all the legal pfs from Definition 2.46
are also formulas of the Principia: this was motivated in Remark 2.47.

Moreover, we proved (in 2.84) that if $f$ is a pf, then the only reasons
why $f$ cannot be legal (according to Definition 2.46) are:

• There is a constituent $z(k_1,\ldots,k_m)$ of $f$ in which $z$ occurs in one of
the $k_i$'s;

• There is a constituent $z(k_1,\ldots,k_m)$ of $f$ and a $j \in \{1,\ldots,m\}$ such
that $k_j$ is a pf, but not a legal pf;

• $f$ contains two non-overlapping constituents $f_1$, $f_2$ that cannot be
typed in one and the same context.

Pfs of the first type cannot be legal in the Principia, because of the vicious
circle principle. The same holds for pfs of the second type, because also in
the Principia, parameters cannot be untyped. The third problem is a
non-issue in the Principia. Formal contexts are not present in the Principia,
but have been introduced in this chapter to make a precise analysis of RTT
possible. Propositional functions of the Principia are always constructed in
one, implicitly defined, context. A formula, therefore, cannot contain two
non-overlapping constituents that cannot be typed in the same context.
This excludes pfs of the third type.

We conclude that we have described the legal pfs of the Principia Math- 
ematica with the formal system RTT. 

We present some refinements of Theorem 2.84 that will be useful in 
future chapters of this thesis: 


Theorem 2.85 Assume $\Gamma \vdash f : t^a$.

• If $f = R(i_1,\ldots,i_{\mathfrak{a}(R)})$ and $x \in \mathrm{FV}(f)$ then $x{:}0^0 \in \Gamma$;

• If $f = z(k_1,\ldots,k_m)$ then there are $u_1^{b_1},\ldots,u_m^{b_m}$, $b$ such that

– $z{:}(u_1^{b_1},\ldots,u_m^{b_m})^b \in \Gamma$;
– $\Gamma \vdash k_i : u_i^{b_i}$ for $k_i \in \mathcal{A}\cup\mathcal{P}$;
– $k_i{:}u_i^{b_i} \in \Gamma$ for $k_i \in \mathcal{V}$.


PROOF:

• By Theorem 2.65, $T(\Gamma) \vdash_{\lambda\to} R i_1 \cdots i_{\mathfrak{a}(R)} : o$. This means that $x{:}\iota \in
T(\Gamma)$. Therefore, $x{:}0^0 \in \Gamma$;

• Let $u_1^{b_1},\ldots,u_m^{b_m}$, $b$ be as in the proof of Theorem 2.84. We only need
to check that $k_i{:}u_i^{b_i} \in \Gamma$ for $k_i \in \mathcal{V}$. We already know that $k_i{:}T(u_i^{b_i}) \in
T(\Gamma)$, and as $u_i^{b_i}$ is predicative, and the type of the variable $k_i$ in $\Gamma$
must be predicative as well, we have $k_i{:}u_i^{b_i} \in \Gamma$.


Strengthening Lemma | 2.56.2 | A31
Free Variable Lemma
Unicity of Types
Subterm Lemma
Strong Normalisation


Figure 3: Comparison of the properties of RTT and modern typed λ-calculus


Conclusions 


In this chapter we gave a formalisation of the Ramified Theory of Types. 
Some of the main ideas underlying this theory were already present in 
Frege’s Abstraction Principles 1.1 and 1.2. 

RTT not only prevents the paradoxes of Frege’s Grundgesetze der Arithmetik,
but also guarantees the well-definedness of substitution, as we have
shown in Theorem 2.73. This second problem was not recognised in the
Principia, where substitution did not even have a proper definition.

There is a close relation between substitution in the Principia and
β-reduction in λ-calculus (Definition 2.24). RTT has characteristics that are
also the basic properties of modern type systems for λ-calculus. See Figure
3. As there is no real reduction in RTT, we do not have an equivalent of
the Subject Reduction theorem. However, the fact that the Free Variable
property 2.58 is maintained under substitution can be seen as a (very weak)
form of Subject Reduction A.30.

Expressing Russell’s propositional functions in λ-calculus has made it
possible to compare these pfs with λ-terms. We found that pfs can be seen
as λ-terms, but in a rather simple way:

• A pf is always a λI-term, i.e. if $\lambda x{:}A.B$ is a subterm of the translation
$\bar f$ of a pf $f$, then $x \in \mathrm{FV}(B)$;

• The translation of a pf always results in a fully applied λ-term in
β-normal form;

• Substitution in the Principia can be seen as application plus
β-reduction to normal form.

Although the description of the Ramified Theory of Types in the Principia
is very informal, it is remarkable that an accurate formalisation of this
system can be made (see Theorem 2.84 and the discussion that follows it).
The formalisation shows that Russell and Whitehead’s ideas on the notion
of types, though very informal by modern standards, must have been very
thorough and to the point.

Apart from the orders, RTT is a subsystem of λ→ [30] via the embeddings
¯ of Section 2a2 and $T$ of Section 2c2. There are, however, important
differences between the way in which the type of a pf is determined in RTT,
and the way in which the type of a λ-term is determined in λ→-Church. The
rules of RTT, and the method of deriving the types of pfs that was presented
in Section 2d, have a bottom-up character: one can only introduce a
variable of a certain type in a context $\Gamma$, if there is a pf that has that type
in $\Gamma$. In λ→, one can introduce variables of any type without wondering
whether such a type is inhabited or not.

Church’s λ→ is more general than RTT in the sense that Church does
not only describe (typable) propositional functions. In λ→, also functions
of type $\iota \to \iota$ (where $\iota$ is the type of individuals) can be described, and
functions that take such functions as arguments, etc.

A characteristic of RTT that is maintained in many modern type sys- 
tems is the syntactic nature of the system: type and order of a pf are 
determined on purely syntactical grounds. No attention is paid to the in- 
terpretation of such a pf. This is remarkable, as the propositions Vx:0°[R(x)] 
and Vx:0°[R(x)] v Vz:()?[z() A =z()] are logically equivalent in most logics?®, 
though they are of different type (the former pf has type ()' and the latter 
has type ()*°). In Section 3c we show that other viewpoints are possible 
besides this concentration on syntax. 


25 At least in all the logical systems that Russell had in mind when he wrote the 
Principia 


Chapter 3 


Deramification 


In this chapter we discuss the development of type theory in the period be- 
tween the appearance of Principia Mathematica (1910-1912) and Church’s 
formulation of the Simple Theory of Types [30] in 1940. 

In Section 3a we show that RTT was not a very easy system to work 
with. Ramsey [101], and Hilbert and Ackermann [64], simplified the system 
by removing the orders. The result is known as the Simple Theory of Types 
(STT). 

Nowadays, STT is known via Church’s formalisation in λ-calculus. However,
STT already existed (1926) before λ-calculus did (1932), and is therefore
not inextricably bound up with λ-calculus. In Section 3b we show
how we can obtain a formalisation of STT directly from the formalisation of
RTT that was presented in Chapter 2 by simply removing the orders. Most
of the properties that were proved for RTT hold for STT as well, including
Unicity of Types and Strong Normalisation. The proofs are all similar to
the proofs that were given for RTT. We also make a comparison between
Church’s formalisation in λ-calculus and the formalisation of STT that is
obtained from RTT. It appears that Church’s system is much more than
only a formalisation. Because of the λ-calculus it is more expressive.

The removal of orders from type theory may suggest that orders are to 
be blamed for the restrictiveness of RTT, and that the concept of order is 
problematic. In Section 3c we show that this is not necessarily the case. 
We introduce a system KTT, based on Kripke’s Hierarchy of Truths [78], 
that has an approach completely opposite to STT. Whilst STT is order-free,
and types play the main role, Kripke’s Hierarchy of Truths is type-free,
and orders play an important, though not a restrictive, role. The main 
difference between Kripke’s and Russell's notion of order is that Russell's 
classification is purely syntactical, whilst Kripke’s is essentially semantical. 
We show that RTT can be embedded in KTT (3c2), and that there is a 
straightforward relation between the orders in RTT and the hierarchy of 
truths of KTT. 


3a History of the deramification 


3a1 The problematic character of RTT


The main part of the Principia is devoted to the development of logic and
mathematics using the legal pfs of the ramified type theory. It appears
that RTT is not easy to use. The main reason for this is the implementation
of the so-called ramification: the division of simple types into orders. We
illustrate this with two examples:


Example 3.1 (Equality) One tends to define the notion of equality in the
style of Leibniz ([53]):

$$x = y \leftrightarrow \forall z[z(x) \leftrightarrow z(y)],$$

or in words: Two individuals are equal if and only if they have exactly the
same properties.

Unfortunately, in order to express this general notion in our formal
system, we have to incorporate all pfs $\forall z{:}(0^0)^n[z(x) \leftrightarrow z(y)]$ for $n \ge 1$,
and this cannot be expressed in one pf.


The ramification does not only influence definitions in logic. Some
important mathematical concepts cannot be defined any more:


Example 3.2 Dedekind constructed the real numbers from the rationals
using so-called Dedekind cuts. In this construction, a real number is a set
$r$ of rationals such that

• $r \ne \emptyset$;

• $r \ne \mathbb{Q}$;

• If $x \in r$ and $y < x$ then $y \in r$;

• If $x \in r$ then there is $y \in r$ with $x < y$.


For instance, the real number $\frac{1}{2}$ is represented by the set $\{x \in \mathbb{Q} \mid 2x < 1\}$,
and the real number $\sqrt{2}$ is represented by the set $\{x \in \mathbb{Q} \mid x < 0 \text{ or } x^2 < 2\}$.
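
The two cuts just mentioned can be written down as membership tests on exact rationals. This is a small sketch for illustration only: the function names are hypothetical, and the code merely spot-checks the four conditions on sample values rather than proving them.

```python
from fractions import Fraction

def half_cut(x: Fraction) -> bool:
    """Membership in the cut representing 1/2: the set {x in Q | 2x < 1}."""
    return 2 * x < 1

def sqrt2_cut(x: Fraction) -> bool:
    """Membership in the cut representing sqrt(2): {x in Q | x < 0 or x^2 < 2}."""
    return x < 0 or x * x < 2

def larger_member_of_half_cut(x: Fraction) -> Fraction:
    """For x in the cut of 1/2, return a strictly larger member (midpoint of x and 1/2)."""
    return (x + Fraction(1, 2)) / 2

samples = [Fraction(n, 10) for n in range(-30, 31)]

# Both cuts are non-empty and differ from Q on the sample values.
nonempty = any(map(half_cut, samples)) and any(map(sqrt2_cut, samples))
proper = not all(map(half_cut, samples)) and not all(map(sqrt2_cut, samples))

# Downward closure on the samples: if x is in the cut and y < x, then y is in the cut.
downward_closed = all(
    cut(y)
    for cut in (half_cut, sqrt2_cut)
    for x in samples if cut(x)
    for y in samples if y < x
)
```

The helper `larger_member_of_half_cut` illustrates the fourth condition for the cut of $\frac{1}{2}$: the midpoint of $x$ and $\frac{1}{2}$ is again smaller than $\frac{1}{2}$ and larger than $x$, so the cut has no greatest element.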

If we take $\mathbb{Q}$ as the set of individuals $\mathcal{A}$, and assume that the binary
relation $<$ on $\mathbb{Q}$ is an element of $\mathcal{R}$, the set of relations, we can see real
numbers as unary predicates $f$ over $\mathbb{Q}$ such that

$$\begin{array}{l}
\exists x{:}0^0[z(x)] \wedge \exists x{:}0^0[\neg z(x)] \;\wedge \\
\forall x{:}0^0[\forall y{:}0^0[z(x) \to y < x \to z(y)]] \;\wedge \\
\forall x{:}0^0[z(x) \to \exists y{:}0^0[z(y) \wedge x < y]]
\end{array} \qquad (1)$$

holds if we substitute $f$ for $z$. We will abbreviate the predicate (1) (with
the free variable $z$) as $\mathbb{R}$. It has type $((0^0)^1)^2$, and real numbers can be
seen as pfs of type $(0^0)^1$. We will, for shortness of notation, write $\mathbb{R}(f)$ for
$\mathbb{R}[z:=f]$, so $\mathbb{R} = \mathbb{R}(z)$. A real number $r$ is smaller than or equal to another
real number $r'$ if for all $x$ with $r(x)$, also $r'(x)$ holds. We write, shorthand,
$r \le r'$ if $r$ is smaller than or equal to $r'$.

In traditional mathematics, the above would define a system that obeys
the traditional axioms for real numbers. In particular, the theorem of the
least upper bound holds for this system. This theorem states that each
non-empty subset of $\mathbb{R}$ with an upper bound has a least upper bound. In
our formalism:

$$\forall v{\subseteq}\mathbb{R}\left[\begin{array}{l}
\bigl(\exists z_1{\in}\mathbb{R}[v(z_1)] \;\wedge\; \exists z_2{\in}\mathbb{R}\,\forall z_3{\in}\mathbb{R}[v(z_3) \to z_3 \le z_2]\bigr) \to \\
\exists z_1{\in}\mathbb{R}\left[\begin{array}{l}
\forall z_2{\in}\mathbb{R}[v(z_2) \to z_2 \le z_1] \;\wedge \\
\forall z_3{\in}\mathbb{R}\bigl[\forall z_2{\in}\mathbb{R}[v(z_2) \to z_2 \le z_3] \to z_1 \le z_3\bigr]
\end{array}\right]
\end{array}\right]$$

(We write, shorthand, $\forall v{\subseteq}\mathbb{R}[g]$ to denote

$$\forall v{:}((0^0)^1)^2[\forall u{:}(0^0)^1[v(u) \to \mathbb{R}(u)] \to g],$$

and $\forall z{\in}\mathbb{R}[g]$ to denote $\forall z{:}(0^0)^1[\mathbb{R}(z) \to g]$). If we try to prove this theorem
within the system of Dedekind as formulated in the Principia-language
RTT, we have to specify a type $t^a$ for the variable $z_1$. As $z_1$ must be a real
number, its type must be $(0^0)^1$. If we give a proof of the theorem, and
construct some object $f$ that should be the least upper bound of a set of
real numbers $V$, $f$ will depend on $V$. Therefore, a general description of $f$
will have a variable $v$ for $V$ in it. As $v$ is of order 2, $f$ must be of order 3 or
more. Therefore, $f$ cannot be a real number, since real numbers have order
1. This makes it impossible to give a constructive proof of the theorem of
the least upper bound within a ramified type theory.


This is a consequence of the fact that it is not possible in RTT to give a 
definition of an object that refers to the class to which this object belongs 
(because of the Vicious Circle Principle). Such a definition is called an 
impredicative definition. The relation with the notion of impredicative 
type is immediate: an object defined by an impredicative definition is of a 
higher order than the order of the elements of the class to which this object 
should belong. This means that the defined object has an impredicative 
type. 

Nowadays we would consider the use of the Vicious Circle Principle too 
strict. The impredicative definition of f is a matter of syntax, whilst the 
existence of the object f has to do with semantics. The fact that we are 
not able to give a predicative definition of f does not imply that such an
object does not exist. Here we must remark that Russell and Whitehead 
did not make a distinction between syntax and semantics in the Principia. 
Therefore they had to interpret the Vicious Circle Principle in the strict 
way above. 


3a2 The Axiom of Reducibility 


Russell and Whitehead tried to solve these problems with the so-called
axiom of reducibility.


Axiom 3.3 (Axiom of Reducibility) For each formula f, there is a
formula g with a predicative type such that f and g are (logically) equivalent.


"Though the basic ideas for this were already present in the works of Frege. See for 
instance Über Sinn und Bedeutung [49].

Accepting this axiom, one may define equality on formulas of order 1 only:

$$x =_L y \;\stackrel{\text{def}}{=}\; \forall z{:}(0^0)^1[z(x) \leftrightarrow z(y)].$$

If $f$ is a function of type $(0^0)^n$ for some $n > 1$, and $a$ and $b$ are individuals
for which the Leibniz equality $a =_L b$ holds, then $f(a) \leftrightarrow f(b)$ holds: with
the Axiom of Reducibility we can determine a predicative function $g$ (so
of type $(0^0)^1$), equivalent to $f$. As $g$ has order 1, $g(a) \leftrightarrow g(b)$ holds. And
because $f$ and $g$ are equivalent, also $f(a) \leftrightarrow f(b)$ holds. This solves the
problem of Example 3.1. A similar solution gives, in Example 3.2, the proof
of the theorem of the least upper bound.
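
The argument of the preceding paragraph can be laid out as a chain of steps. The display below is a reconstruction for illustration: it assumes that "equivalent" in the Axiom of Reducibility may be read as a pointwise equivalence between $f$ and $g$ on individuals, and it writes the order-1 Leibniz equality of $a$ and $b$ out in full as the first hypothesis.

```latex
\begin{align*}
&\text{(hypothesis)}       && \forall z{:}(0^0)^1\,[\,z(a) \leftrightarrow z(b)\,] \\
&\text{(Axiom 3.3)}        && g \text{ has type } (0^0)^1 \text{ and } f(x) \leftrightarrow g(x) \text{ for every individual } x \\
&\text{(instantiate } z := g\text{)} && g(a) \leftrightarrow g(b) \\
&\text{(conclusion)}       && f(a) \leftrightarrow g(a) \leftrightarrow g(b) \leftrightarrow f(b)
\end{align*}
```

The instantiation step is the only place where predicativity is needed: $g$ may be substituted for $z$ because it has the order-1 type over which $z$ ranges, whereas $f$ itself, being of higher order, may not.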

The validity of the Axiom of Reducibility has been questioned from the 
moment it was introduced. In the introduction to the 2nd edition of the 
Principia, Whitehead and Russell admit: 


“This axiom has a purely pragmatic justification: it leads to the 
desired results, and to no others. But clearly it is not the sort 
of axiom with which we can rest content.” 


(Principia Mathematica, p. xiv) 


Moreover, Weyl states that


“if the properties are constructed there is no room for an axiom 
here; it is a question which ought to be decided on ground of 
the construction” 


(Mathematics and Logic: A brief survey serving as preface to 
a review of “The Philosophy of Bertrand Russell”, p. 5) 


and that 


“with his axiom of reducibility Russell therefore abandoned the 
road of logical analysis and turned from the constructive to the 
existential-axiomatic standpoint.” 


(Ibid., p. 6)

With the more modern developments of logic in our mind, we could add
the following objection, associated to Weyl’s argument above, against the 
Axiom. The Axiom of Reducibility states that for each f, there is a pred- 
icative g that is logically equivalent to f. The function g is something at 
object level, but the statement “f is logically equivalent to g” is a statement 
at a higher level than the object level. Pfs exist (at least, in the syntac- 
tic construction) independently from the existence of the notion of logical 
equivalence. 

Moreover, there is more than one notion of logical equivalence, corre- 
sponding to the various kinds of logic that have been developed, or could 
have been developed. It would be remarkable if one Axiom of Reducibility 
would provide predicative pfs f for any kind of logic that is available, or 
can be thought of, and indeed, this is not true as shown by the following 
trivial example: 


Example 3.4 We consider so-called “bureaucratic logic”. This logic has a
set of axioms A, and no derivation rules at all. In short, a proposition g is
true if and only if g ∈ A. Take, for the sake of the argument,


A = {g ∈ P | g : ()^0},


so A is the set of all predicative propositions. In this system, a proposition
g is true if and only if it is predicative. If f is an impredicative proposition,
then so is f ↔ g, for any proposition g. Therefore, f ↔ g is false for any
proposition g, in particular for any predicative proposition g. So the Axiom
of Reducibility does not hold in bureaucratic logic.
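
The argument of Example 3.4 can be replayed on a toy model. The sketch below is only an illustration under simplifying assumptions that are not part of the text: a proposition is reduced to a name plus an integer order (order 0 standing for the predicative type ()^0), and the names `Prop`, `iff` and `is_true` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prop:
    name: str
    order: int  # assumption: 0 models the predicative type ()^0


def iff(f: Prop, g: Prop) -> Prop:
    # An equivalence with an impredicative part is itself impredicative,
    # so it inherits the highest order of its two sides.
    return Prop(f"({f.name} <-> {g.name})", max(f.order, g.order))


def is_true(p: Prop) -> bool:
    # Bureaucratic logic: no derivation rules, truth is membership of the
    # axiom set A, taken here to be the set of predicative propositions.
    return p.order == 0


predicative = [Prop("g1", 0), Prop("g2", 0)]
f = Prop("f", 1)  # an impredicative proposition

# The Axiom of Reducibility would need some predicative g with f <-> g true.
reducible = any(is_true(iff(f, g)) for g in predicative)
```

In this toy model `reducible` is false however many predicative candidates are offered, which is the point of the example: the equivalence itself never lands in the axiom set.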


Though Weyl [120] made an effort to develop analysis within the Ramified
Theory of Types (but without the Axiom of Reducibility), and various
parts of mathematics can be developed within RTT and without the Axiom²,
the general attitude towards RTT (without the Axiom) was that the system
was too restrictive, and that a better solution had to be found.


²See [67], where many algebraic notions are developed within the Nuprl Proof
Development System, a proof checker based on the hierarchy of types and orders
of RTT without the Axiom of Reducibility.
3a3 Deramification


The first impulse to such a solution was given by Ramsey in 1926 [101]. 
He recalls that the Vicious Circle Principle 2.1 was postulated in order to 
prevent the paradoxes. Though all the paradoxes were prevented by this 
Principle, Ramsey considers it essential to divide them into two parts: 


1. One group of paradoxes is removed 


“by pointing out that a propositional function cannot sig- 
nificantly take itself as argument, and by dividing functions 
and classes into a hierarchy of types according to their pos- 
sible arguments.” 


(The Foundations of Mathematics, p. 356) 


This means that a class can never be a member of itself. The para- 
doxes solved by introducing the hierarchy of types (but not orders), 
like the Russell paradox, and the Burali-Forti paradox, are called log- 
ical or syntactical paradoxes; 


2. The second group of paradoxes is excluded by the hierarchy of orders. 
These paradoxes (like the Liar’s paradox, and the Richard Paradox) 
are based on the confusion of language and meta-language. These 
paradoxes are, therefore, not of a purely mathematical or logical na- 
ture. When a proper distinction between object language (the pfs 
of the system RTT, for example) and meta-language is made, these 
so-called semantical paradoxes disappear immediately. 


Ramsey agrees with the part of the theory that eliminates the syntactic 
paradoxes. This part is in fact RTT without the orders of the types. The 
second part, the hierarchy of orders, does not gain Ramsey’s support: if 
a proper distinction between object-language and meta-language is made, 
the semantic paradoxes disappear. Moreover, by accepting the hierarchy in 
its full extent one either has to accept the Axiom of Reducibility or reject 
ordinary real analysis. Ramsey is supported in his view by Hilbert and 
Ackermann [64]. They all suggest a deramification of the theory, i.e. leav- 
ing out the orders of the types. When making a proper distinction be- 
tween language and meta-language, the deramification will not lead to a 
re-introduction of the (semantic) paradoxes. 
The solution proposed by Ramsey, and Hilbert and Ackermann, looks
better than the Axiom of Reducibility. Nevertheless, both deramification 
and the Axiom of Reducibility are violations of the Vicious Circle Principle, 
and reasons (of a more fundamental character than “they do not lead to 
a re-introduction of the semantic paradoxes” and “it leads to the desired 
results, and to no others”) why these violations can be harmlessly made 
must be given. Gödel [58] fills in this gap. He points out that whether one 
accepts this second principle or not, depends on the philosophical point of 
view that one has with respect to logical and mathematical objects: 


“it seems that the vicious circle principle [...] applies only if
the entities involved are constructed by ourselves. In this case 
there must clearly exist a definition (namely the description of 
the construction) which does not refer to a totality to which the 
object defined belongs, because the construction of a thing can 
certainly not be based on a totality of things to which the thing 
to be constructed itself belongs. If, however, it is a question of 
objects that exist independently of our constructions, there is 
nothing in the least absurd in the existence of totalities contain- 
ing members, which can be described only by reference to this 
totality.” 


(Russell's mathematical logic) 


The remark puts the Vicious Circle Principle back from a proposition (a 
statement that is either true or false, without any doubt) to a philosophical 
principle that will be easily accepted by, for instance, intuitionists (for 
whom mathematics is a pure mental construction) or constructivists, but 
that will be rejected, at least in its full strength, by mathematicians with 
a more platonic point of view. 

Gödel is supported in his ideas by Quine [100], sections 34 and 35.
Quine’s criticism of impredicative definitions (for instance, the definition
of the least upper bound of a nonempty subset of the real numbers with an 
upper bound) is not on the definition of a special symbol, but rather on the 
very assumption of the existence of such an object at all. Quine continues 
by stating that even for Poincaré, who was an opponent of impredicative 
definitions and deramification, one of the doctrines of classes is that they 
are there “from the beginning”. So, even for Poincaré there should be no
evident fallacy in impredicative definitions. 

The deramification has played an important role in the development of
type theory. In 1932 and 1933, Church presented his (untyped) λ-calculus
[28, 29]. In 1940 he combined this theory with a deramified version of
Russell’s theory of types to the system that is known as the simply typed
λ-calculus³.


3b The Simple Theory of Types 


3b1 Constructing the Simple Theory of Types from RTT 


It is straightforward to carry out the deramification as it was originally 
proposed by Ramsey, Hilbert and Ackermann: We take the formalisation 
of RTT that was presented in Chapter 2, and leave out all the orders and the 
references to orders (including the notions of predicative and impredicative 
types). The system we obtain in this way will be denoted STT. The types 
used in the system are the simple types of Definition 2.31. 

The following definitions, lemmas, theorems and corollaries, including 
their proofs, can be adapted to STT without any problems: 2.43, 2.44, 
2.45, 2.46, 2.56, 2.57 (first free variable theorem), 2.58 (second free variable 
theorem), 2.59, 2.61 (unicity of types), 2.64, 2.65, 2.67, 2.71, 2.72, 2.73 
(existence of substitution), 2.74, 2.75 and 2.76 (subterm lemma). 

The description of legal pfs for STT follows the same line as in Section 
2d, with straightforward adaptions of 2.77, 2.78 (now, all simple types are 
inhabited), 2.81, 2.82, 2.83, and finally 2.84 (characterisation of legal pfs): 


Theorem 3.5 Let f ∈ P. f is legal (mod α) if and only if:

• f = R(i_1, ..., i_{a(R)}), or

• f = z(k_1, ..., k_n), z ≠ k_j for all k_j ∈ V and z does not occur in any
k_j ∈ P, and there is Γ with FV(f) ⊆ DOM(Γ) and for all k_j ∈ P,
Γ ⊢ k_j : t_j, or

• f = ¬f' and f' is legal (mod α) or f = f_1 ∨ f_2, there are Γ_i and t_i
such that Γ_i ⊢ f_i : t_i (mod α) and Γ_1 ∪ Γ_2 is a context, or

• f = ∀x:t.f' and f' is legal.


³Thus, the adjective simple is used to distinguish the theory from the more
complicated (both in its construction with a double hierarchy and in its use)
ramified theory. The classification “simple”, therefore, has nothing to do with
the fact that STT, formulated with λ-calculus as described in [30], is the
simplest system of the Barendregt Cube (see Ac).


A comparison between the formalisations of STT and RTT can easily be 
made using Theorems 3.5 and 2.84. We find that 


• All RTT-legal pfs are (when the ramified types behind the quantifiers
are replaced by their corresponding simple types) STT-legal;


• An STT-legal pf f is RTT-legal, except when f contains a subformula of
the form z(k_1,...,k_n), where one or more of the k_j's are not RTT-legal
or can only be typed in RTT by an impredicative type.


3b2 Comparison of STT with Church’s λ→


Nowadays, the Simple Theory of Types is often identified with Church’s
formalisation of it in [30]. The definition of λ→ that was given there is
repeated in Section Ab of the Appendix.

We make the following remarks with respect to λ→ and the Simple
Theory of Types.


Remark 3.6 We see that the constants ¬, ∧, ∀_α and ι_α are terms. This
may need some explanation for the modern reader.


• Church considers ¬ and ∧ to be functions. The function ¬ takes
a proposition as argument, and returns a proposition; similarly ∧
takes two propositions as arguments, and returns a proposition. In
Definition A.16, we see that ¬ and ∧ are assigned the corresponding
types o → o and o → o → o;


• More remarkable: ∀_α and ι_α are just terms, and do not act as binding
operators. The usual variable binding of ∀_α and ι_α is obtained via
λ-abstraction: instead of ∀x:α[f], Church writes ∀_α(λx:α.f). In this
way, ∀_α is a function that takes a propositional function of type α → o
as argument, and returns a proposition (a term of type o). In
Definition A.16, ∀_α obtains the corresponding type (α → o) → o.
Similarly, the choice operator ι_α takes a propositional function of
type α → o as argument, and returns a term of type α. The term
ιx:α.f, or in Church’s notation: ι_α(λx:α.f), has as interpretation: the
(unique) object t of type α for which f[x:=t] holds. Correspondingly,
the type of ι_α is (α → o) → α.
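
The idea that a quantifier is an ordinary constant applied to a λ-abstraction can be made concrete over a finite carrier. The following is a minimal sketch, not Church's system: the finite domain, the function names `forall` and `iota`, and the choice to raise an error when there is no unique witness are all assumptions made for illustration.

```python
def forall(domain):
    """Build a constant playing the role of the universal quantifier at one
    type: it takes a predicate (alpha -> o) and returns a truth value (o)."""
    carrier = list(domain)
    return lambda pred: all(pred(x) for x in carrier)


def iota(domain):
    """Build a description operator at one type: it takes a predicate
    (alpha -> o) and returns the unique element (alpha) satisfying it."""
    carrier = list(domain)

    def pick(pred):
        witnesses = [x for x in carrier if pred(x)]
        if len(witnesses) != 1:
            # Assumption of this sketch: fail loudly when uniqueness fails.
            raise ValueError("no unique witness")
        return witnesses[0]

    return pick


digits = range(10)
forall_digit = forall(digits)
iota_digit = iota(digits)

# The binding is done by the lambda, not by the quantifier itself.
all_below_ten = forall_digit(lambda x: x < 10)
all_positive = forall_digit(lambda x: x > 0)
root_of_49 = iota_digit(lambda x: x * x == 49)
```

Here `forall_digit` and `iota_digit` are plain values that can be passed around like any other function, which mirrors the remark that these constants are just terms.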


The mappings T for types and ~ for terms (see Definition 2.40 and
Definition 2.7), adapted for STT, make it possible to compare STT with
λ→.

Regarding the types, we find that T gives an injective correspondence
between types of STT and λ→. T is clearly not surjective, as T(t) is never of
the form α → ι (this follows directly from Definition 2.34). This indicates
an important difference between STT and λ→. In RTT and STT, functions
(other than propositional functions) have to be defined via relations (and
this is the way it is done in Principia Mathematica). The value of such a
function f, described via the relation R, for a certain value a is described
using the ι-operator: ιy.R(a, y) (to be interpreted as: the unique y for which
R(a, y) holds). Things get even more complicated if one realizes that the
ι-operator is not a part of the syntax used in Principia Mathematica, but an
abbreviation with a not so straightforward translation (see [121], pp. 66-71).
In λ→, as everywhere in λ-calculus, functions (both propositional functions
and other ones) are first-class citizens, which means that the construction
with the ι-operator is not the first tool to be used when constructing a
function. If one has an algorithm (a λ-term) that describes the function f,
the value of f for the argument a can be easily described via the term fa.
And even if such an algorithm is not at hand, one can use the ι-operator,
which is part of the syntax of λ→. This makes λ→ much easier to use for
the formalisation of logic and mathematics than RTT and STT.

Regarding the terms, ~ provides an injective correspondence between
terms of STT and λ→. Again, this mapping is not surjective, for several
reasons:


• T is not surjective. As there is no t with T(t) = ι → ι, there cannot
be a legal pf f whose translation is λx:ι.x (cf. Theorem 2.65.2 adapted
for STT);


• We already observed that the translation of f is a λI-term for all
f ∈ P. λ→ also allows terms like λx:α.y;


• If the translation of f is z H_1 ··· H_n for some z ∈ V and some terms
H_1,...,H_n, the H_i's must be either closed λ-terms, or variables, or
individuals. This means that there is no f ∈ P whose translation is
λz:o→o.λx:ι.z(Rx), since Rx contains the free variable x and is neither
a variable nor an individual;


• We remark that the translation of f is always a closed λ-term, so there
is no f ∈ P whose translation is x;


• It has already been remarked that the ι-operator is part of the syntax
of λ→, and this is not the case in STT and RTT.


The discussion above makes clear that λ→ is a far more expressive
system than RTT and STT. Type-theoretically, it generalises the idea of
function types of Frege and Russell from propositional functions to more
general functions.

Philosophically, there is another important difference between STT and
λ→. The systems STT and RTT have a strong bottom-up approach: To
type a higher-order pf one has to start with propositions of order 0. Only
by applying the abstraction principles is it possible to obtain higher-order
pfs. In λ→, one can introduce a variable of a higher-order type at once,
without having to refer to terms of lower order.


3c Are the orders to be blamed? 


The historical success of the deramification makes it attractive to conclude 
that the ramification of Russell's theory is to be “blamed” for the restrictive- 
ness of RTT: the orders were just an emergency measure, and by removing 
them from the theory, everything works fine. This reasoning, however, is 
a bit too fast. Orders still play a role in logic, and they provide a useful 
intuition to describe how complicated a certain proposition is (for example 
“first-order”, or “second-order”).

Moreover, we feel that there are reasons to criticise not the concept 
of order, but Russell’s definition of order. Russell’s classification of pfs in 
types and orders is purely syntactic. This is quite harmless as far as (simple) 
types are concerned: the number of arguments that a propositional function 
takes is a notion that can be reasonably described by syntactic methods.⁴


⁴The only criticism that one could have is that Russell’s method excludes so-called
“constant” functions, i.e. functions of which the outcome is independent of one or more 
For orders, the syntactic classification is more questionable. Look for
instance at the pfs f = ∀x:0^0[R(x)] and g = ∀x:0^0[R(x)] ∧ ∀z:()^9[z() ∨ ¬z()].
According to the Principia, g is a proposition of order 10, because g contains
a variable of order 9. On the other hand, f, a proposition of order 1, is
logically equivalent to g. So, the interpretation of g does not essentially
involve the variable z of order 9. We could therefore argue that the order
of g, for semantic reasons, should not be higher than 1, the order of f.


In the forthcoming sections we show that there are workable systems 
that do have an order-like hierarchy and nevertheless are not restrictive. 
The system that we present however, does not make a syntactic classifica- 
tion of propositional functions into orders, but a semantic one. 


The system is based on Kripke’s paper [78]. In 1936, Tarski [117] proved 
that introducing a truth-predicate T in a first-order language leads to con- 
tradictions. Such a predicate T is true for objects that are encodings of 
true propositions, and false for objects that are encodings of false propo- 
sitions. For this reason, Tarski distinguishes between object-language and 
meta-language, and the truth predicate for propositions of the object lan- 
guage occurs only at the meta-level. For a truth predicate for propositions 
in meta-language one needs a meta-meta-language, etcetera. 


Kripke, however, allows a restricted truth-predicate in a first-order
language. The restrictions on this predicate are such that no contradictions
occur. The construction of Kripke’s truth predicate has remarkable
similarities with the use of orders in the Ramified Theory of Types, and we
show that RTT can be embedded within Kripke’s Theory of Truth KTT.
It is even possible to define a notion of order for pfs of RTT, based on the 
construction of Kripke’s truth predicate. An important difference is that 
this new notion of order is (partially) based on the interpretation of a pf, 
whilst Russell's definition of order is purely syntactic. 


In Section 3c1 we describe KTT. In Section 3c2 we embed RTT in KTT
and show that Russell's syntactic approach is much more restrictive than 
Kripke’s semantic approach. 


Parts of this section are based on [70]. 


of the arguments it takes. We saw this in our translation of pfs to λ-terms: all the
translations were λI-terms (see Lemma 2.23.3).
3c1 Kripke’s Theory of Truth KTT


In this section, we briefly describe Kripke’s Theory of Truth: KTT (see
[78]). Kripke expresses higher-order formulas within a first-order language,
using the fact that many interesting languages are rich enough to express
their own syntax via a Gödel numbering.

In the rest of this Section 3c, L is a first-order language. 𝒜 is the set
of constant symbols in L, F is the set of function symbols in L, and ℛ
represents the set of predicate symbols of L. We assume that M = (A, [ ])
is a model for L with domain A, where [ ] is an interpretation function for
the symbols of L.

Let us also assume two subsets S_1 and S_2 of A such that S_1 ∩ S_2 = ∅.
Kripke extends L by adding a monadic predicate T. The main idea is
to interpret T as a unary “truth predicate” such that S_1 contains the
elements a of A that are (codes of) formulas which we consider to be
“true”, and S_2 contains those a ∈ A that are (codes of) formulas which we
consider to be “false”. This extension of the model is denoted as M(S_1, S_2).
We do not demand that S_1 ∪ S_2 = A, hence T may be a partially defined
predicate.


Definition 3.7 (Logical Truth for L) Let s be an assignment function
V → A. We define M ⊨ f[s] as follows⁵:


f                 M ⊨ f[s]                           M ⊨ (¬f)[s]
R(a_1,...,a_m)    (a_1(s),...,a_m(s)) ∈ [R]          (a_1(s),...,a_m(s)) ∉ [R]
g_1 ∧ g_2         M ⊨ g_1[s] and M ⊨ g_2[s]          M ⊨ ((¬g_1) ∨ (¬g_2))[s]
g_1 ∨ g_2         M ⊨ g_1[s] or M ⊨ g_2[s]           M ⊨ ((¬g_1) ∧ (¬g_2))[s]
∀x[g]             M ⊨ g[s[x:=a]] for all a ∈ A       M ⊨ (∃x[¬g])[s]
∃x[g]             M ⊨ g[s[x:=a]] for an a ∈ A        M ⊨ (∀x[¬g])[s]
¬g                M ⊨ (¬g)[s]                        M ⊨ g[s]

Here, R ∈ ℛ has arity m; a, a_1, ..., a_m are terms of L, and g, g_1, g_2 are
formulas of L. Moreover, s[x:=a] is the assignment function that assigns a
to x, and s(y) to any variable y ∈ V \ {x}. Now let S_1, S_2 ⊆ A such that


⁵Notice that even though this definition is different from Tarski’s definition, especially
with respect to the definition of M ⊨ ¬f, it is easy to prove the equivalence of both
definitions. This is because all primitive predicates of L are totally defined. We took this
definition, however, as we need to extend it for the partial predicate T.
S_1 ∩ S_2 = ∅. L(T) is the extension of L with the monadic predicate T. We
extend the definition of M ⊨ f to M(S_1, S_2) ⊨ f by putting


M(S_1, S_2) ⊨ (T(a))[s] iff a(s) ∈ S_1


and

M(S_1, S_2) ⊨ (¬T(a))[s] iff a(s) ∈ S_2.


It is important (and easy) to notice that the extension of L and M to L(T) 
and M(S1, S2) is conservative: 


Lemma 3.8 Assume f is a sentence in L, and S_1, S_2 ⊆ A such that
S_1 ∩ S_2 = ∅. Then M ⊨ f if and only if M(S_1, S_2) ⊨ f. □
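
The two-column style of definition, with a separate clause for when a negation holds, can be sketched as a pair of mutually recursive functions. The code below is an illustration only, under assumptions not in the text: the domain is finite, predicate symbols are unary, formulas are nested tuples, and the names `holds`, `refuted` and `val` are hypothetical.

```python
def val(term, s):
    # A term is a variable (looked up in the assignment s) or a constant
    # that denotes itself.
    return s.get(term, term)


def holds(f, s, m):
    """M(S1, S2) |= f[s]: the middle column of the truth definition."""
    tag = f[0]
    if tag == "R":
        return val(f[2], s) in m["R"][f[1]]
    if tag == "T":
        return val(f[1], s) in m["S1"]
    if tag == "and":
        return holds(f[1], s, m) and holds(f[2], s, m)
    if tag == "or":
        return holds(f[1], s, m) or holds(f[2], s, m)
    if tag == "all":
        return all(holds(f[2], {**s, f[1]: a}, m) for a in m["dom"])
    if tag == "ex":
        return any(holds(f[2], {**s, f[1]: a}, m) for a in m["dom"])
    if tag == "not":
        return refuted(f[1], s, m)
    raise ValueError(tag)


def refuted(g, s, m):
    """M(S1, S2) |= (not g)[s]: the right-hand column."""
    tag = g[0]
    if tag == "R":
        return val(g[2], s) not in m["R"][g[1]]
    if tag == "T":
        return val(g[1], s) in m["S2"]
    if tag == "and":
        return refuted(g[1], s, m) or refuted(g[2], s, m)
    if tag == "or":
        return refuted(g[1], s, m) and refuted(g[2], s, m)
    if tag == "all":
        return any(refuted(g[2], {**s, g[1]: a}, m) for a in m["dom"])
    if tag == "ex":
        return all(refuted(g[2], {**s, g[1]: a}, m) for a in m["dom"])
    if tag == "not":
        return holds(g[1], s, m)
    raise ValueError(tag)


model = {"dom": {0, 1, 2}, "R": {"Even": {0, 2}}, "S1": set(), "S2": set()}
em_R = ("all", "x", ("or", ("R", "Even", "x"), ("not", ("R", "Even", "x"))))
em_T = ("all", "x", ("or", ("T", "x"), ("not", ("T", "x"))))
```

With both S1 and S2 empty, `em_R` holds while `em_T` neither holds nor is refuted, so T behaves as a partial predicate. Changing S1 and S2 does not affect `em_R`, in the spirit of Lemma 3.8.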


The predicate T cannot express truth in a direct way. This is because T 
is a predicate in a first order language, and therefore can only take terms 
(objects) as arguments, and not propositions. However, there is an indirect 
way in which T can express truth: enumerate all formulas of L(T). 

From now on, we assume that we have an injective, primitive recursive
function ⟨ ⟩ from the formulas (including non-closed formulas) of the first
order language L(T) to the terms of L. If f is a sentence then we can form
the proposition T(⟨f⟩), which expresses the truth of f. But we can also
form the proposition T(⟨T(⟨f⟩)⟩), so we can discuss the truth of the truth
of f, etcetera. This makes it possible to express higher-order propositions.

As announced, we use the sets S_1 and S_2 to establish the truth of such
higher-order propositions. Actually, we build a hierarchy of sets S_{α,1} and
S_{α,2} for ordinals α. We will see that this hierarchy has much in common
with Russell’s hierarchy of orders.


Definition 3.9 For any ordinal α we define a pair of sets (S_{α,1}, S_{α,2}) and
a model M_α.

• S_{0,1} ≝ ∅; S_{0,2} ≝ ∅; M_0 ≝ M(S_{0,1}, S_{0,2});


• If S_{α,1}, S_{α,2} and M_α have been defined, then we define:


S_{α+1,1} ≝ {[⟨f⟩] | f is a sentence and M_α ⊨ f}


S_{α+1,2} ≝ {[⟨f⟩] | f is a sentence and M_α ⊨ ¬f}


M_{α+1} ≝ M(S_{α+1,1}, S_{α+1,2});

• If α is a limit ordinal and S_{β,1}, S_{β,2} and M_β have been defined for all
β < α, then


S_{α,i} ≝ ⋃_{β<α} S_{β,i}   (i = 1, 2)


M_α ≝ M(S_{α,1}, S_{α,2}).


The proof of the following lemma is not difficult. Yet, the lemma plays 
a crucial role in the rest of this Section. 


Lemma 3.10 (Conservation of Knowledge (1))
If α < β then S_{α,1} ⊆ S_{β,1} and S_{α,2} ⊆ S_{β,2}.


PROOF: Induction on β. The only non-trivial case is β = β' + 1. By the
induction hypothesis, S_{α,i} ⊆ S_{β',i} for all α < β', so it suffices to prove that
S_{β',i} ⊆ S_{β,i}.

We only give the proof for the case S_{β,1}; the proof for S_{β,2} is similar.

So assume a ∈ S_{β',1}. Determine γ < β' and a sentence f such that [⟨f⟩] = a
and M_γ ⊨ f. Use induction on the definition of M_γ ⊨ f. We treat only
one case, the others are trivial: Assume f = T(a') for some term a' of L(T).
Then [a'] ∈ S_{γ,1} by definition of M_γ ⊨ f. By the induction hypothesis
on β we know: S_{γ,1} ⊆ S_{β',1}, so: [a'] ∈ S_{β',1}. By definition, this means
M_{β'} ⊨ f. Hence [⟨f⟩] ∈ S_{β'+1,1}, and this means a ∈ S_{β,1}. □


Corollary 3.11 (Conservation of Knowledge (2))
If α < β and M_α ⊨ f then M_β ⊨ f. □


Remark 3.12 It is not the case that M_α ⊭ f implies M_β ⊭ f for α < β.
For instance, let f = ∀x[R(x) ∨ ¬R(x)]. Notice that M_0 ⊨ f. Therefore,
[⟨f⟩] ∈ S_{1,1}, so M_1 ⊨ T(⟨f⟩). But S_{0,1} = ∅, so M_0 ⊭ T(⟨f⟩).


We prove that the theories with the new predicate T are all consistent: 


Lemma 3.13 Let α be an ordinal.


1. For all formulas f of L(T) and for all assignments s,
M_α ⊭ (f ∧ ¬f)[s];
2. S_{α,1} ∩ S_{α,2} = ∅.


PROOF: We use induction on α. So assume the lemma has been proven for
all β < α (IH1).


1. Use induction on the structure of f. So assume the lemma has been
proven for all subformulas g of f (IH2). We treat three cases only (the
other cases are similar).

• f = R(a_1,...,a_m). If M_α ⊨ (f ∧ ¬f)[s] then (a_1(s),...,a_m(s)) ∈
[R] and (a_1(s),...,a_m(s)) ∉ [R], which is impossible;

• f = g_1 ∧ g_2. If M_α ⊨ (f ∧ ¬f)[s] then M_α ⊨ (g_1 ∧ g_2)[s], so
M_α ⊨ g_j[s] for j = 1, 2. Also: M_α ⊨ (¬(g_1 ∧ g_2))[s], so M_α ⊨
((¬g_1) ∨ (¬g_2))[s], so M_α ⊨ (¬g_j)[s] for j = 1 or j = 2. Therefore
M_α ⊨ (g_j ∧ ¬g_j)[s] for j = 1 or j = 2, which contradicts (IH2);

• f = T(a) for a term a. If M_α ⊨ (f ∧ ¬f)[s] then a(s) ∈ S_{α,1} and
a(s) ∈ S_{α,2}, which contradicts (IH1);


2. If a ∈ S_{α,1} ∩ S_{α,2}, then determine a formula g such that [⟨g⟩] = a, and
β_1, β_2 < α such that M_{β_1} ⊨ g and M_{β_2} ⊨ ¬g. Let β = max(β_1, β_2).
Then β < α and because of Conservation of Knowledge 3.11, M_β ⊨
g ∧ ¬g. This contradicts (IH1).


□


We can see the construction of the models M_α as a process of obtaining
knowledge. At the initial stage 0, T(⟨f⟩) is not defined for any formula f.
There is no knowledge at all.

By applying the definition of truth given for M_0, we obtain knowledge.
Some sentences f can be judged true: M_0 ⊨ f. We store the code of f
in S_{1,1}. Some other sentences g can be judged false: M_0 ⊨ ¬g. The code
of g is stored in S_{1,2}. It is not possible to judge all sentences. For
instance, neither M_0 ⊨ ∀x[T(x) ∨ ¬T(x)] nor M_0 ⊨ ¬∀x[T(x) ∨ ¬T(x)] holds,
so [⟨∀x[T(x) ∨ ¬T(x)]⟩] belongs neither to S_{1,1} nor to S_{1,2}.

The knowledge we have obtained is expressed by the behaviour of the
predicate T in M_1. At stage 1 we know more about T than at stage 0:
S_{0,1} = S_{0,2} = ∅, but S_{1,1} ≠ ∅ and S_{1,2} ≠ ∅. Hence more sentences
can be judged true or false. We store the codes of sentences that were
judged “true” at level 1 in S_{2,1} and codes of the sentences that were judged
“false” at level 1 in S_{2,2}. The Lemma on Conservation of Knowledge 3.10
guarantees that this process only extends our knowledge, i.e.:


• Sentences that were judged to be true at level 0 remain true at level 1;
• Sentences that were judged to be false at level 0 remain false at level 1.

By iterating this process we arrive at the levels 2, 3, ..., ω, ω+1, .... One
might expect that for each sentence f there is an ordinal α for which f ∈
S_{α,1} ∪ S_{α,2}. This is not the case. There are sentences of which the truth
will never be established. See the forthcoming Example 3.30.
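
The stage-by-stage growth of knowledge can be simulated on a finite toy language. The sketch below is an illustration under strong assumptions that are not in the text: sentences are propositional, the code of a sentence is simply its name, only finitely many stages are needed, and the names `sentences`, `holds`, `refuted` and `stages` are hypothetical. The sentence named "liar" is a liar-like sentence added for illustration.

```python
sentences = {
    "p": ("atom", True),            # a T-free sentence that is true
    "q": ("atom", False),           # a T-free sentence that is false
    "Tp": ("T", "p"),               # "p is true"
    "TTp": ("T", "Tp"),             # "it is true that p is true"
    "notTq": ("not", ("T", "q")),   # "q is not true"
    "liar": ("not", ("T", "liar")), # refers to its own code
}


def holds(f, s1, s2):
    tag = f[0]
    if tag == "atom":
        return f[1]
    if tag == "T":
        return f[1] in s1
    if tag == "not":
        return refuted(f[1], s1, s2)
    raise ValueError(tag)


def refuted(f, s1, s2):
    tag = f[0]
    if tag == "atom":
        return not f[1]
    if tag == "T":
        return f[1] in s2
    if tag == "not":
        return holds(f[1], s1, s2)
    raise ValueError(tag)


def stages(table):
    """Return the list of pairs (S_n1, S_n2), starting from two empty sets
    and stopping at the first stage that adds no new knowledge."""
    s1, s2 = frozenset(), frozenset()
    history = [(s1, s2)]
    while True:
        n1 = frozenset(n for n, f in table.items() if holds(f, s1, s2))
        n2 = frozenset(n for n, f in table.items() if refuted(f, s1, s2))
        if (n1, n2) == (s1, s2):
            return history
        s1, s2 = n1, n2
        history.append((s1, s2))


hierarchy = stages(sentences)
final_true, final_false = hierarchy[-1]
```

In this run the sets only grow from one stage to the next, the truth of "TTp" is settled two stages after that of "p", and "liar" never enters either set, so its truth is never established.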


3c2 RTT in KTT


Both in RTT and in KTT we are confronted with a hierarchy. Russell con- 
structs a hierarchy by dividing propositions and propositional functions 
into different orders, taking care that a propositional function f can only 
depend on objects of a lower order than the order of f. 

Kripke does not make this distinction beforehand. He has only one 
truth-predicate (T), but decisions about the truth of propositions are split 
into different levels: At the first level only decisions about propositions 
that do not involve any knowledge about T are made (for example: the 
proposition R(a_1), but also the proposition (R(a_1) ∨ ¬R(a_1)) ∨ T(a_2)). At the
second level decisions about propositions involving T for codes of first-level 
propositions are made, and so on. 

Before we can compare RTT with KTT, we must give a formal definition 
of logical truth for pfs of RTT. After that, we investigate the similarity
between both hierarchies in subsection 3c2.2 by describing RTT within KTT.
In subsection 3c2.3 we investigate in which way RTT is more restrictive with 
respect to self-reference than KTT. 


3c2.1 Logical truth for RTT in Tarski’s style 


As KTT uses Tarski’s notion of logical truth, we use a similar notion for
RTT. This definition of logical truth is quite informal. For example, the first
clause “If (a_1,...,a_m) ∈ R then RTT ⊨ R(a_1,...,a_m)” requires the symbol
R to be already fully interpreted and to denote a relation independently
of any Tarskian assignment function. This is in line with Russell, who did
not distinguish between the syntactic symbol R in “⊨ R(a_1,...,a_m)”
and the semantic use of R in “(a_1,...,a_m) ∈ R”. We take care that it will
always be clear whether we use a symbol in its syntactic or in its semantic
way.

We must remark that the definition of logical truth for RTT is not due 
to Russell and cannot be found in Principia Mathematica. In Principia, 
Russell and Whitehead use a notion of truth based on derivations in natural 
deduction style. But in order to make clear the similarities between RTT 
and KTT, we must use the notion of truth used in KTT in RTT as well. 


Definition 3.14 (Logical Truth for RTT) Let Γ be an RTT-context with
domain V. Let f ∈ P be a legal pf in Γ with free variables x_1,...,x_n of
types t_1^{a_1},...,t_n^{a_n}. Let s: V → P be such that s(x_i) : t_i^{a_i} ∈ Γ. We define
RTT ⊨_Γ f[s] by induction on the order of f. We give this definition by
induction on the structure of f.


• RTT ⊨_Γ R(i_1,...,i_{a(R)})[s] if (i_1(s),...,i_{a(R)}(s)) ∈ R;
• RTT ⊨_Γ (g_1 ∨ g_2)[s] if RTT ⊨_Γ g_1[s] or RTT ⊨_Γ g_2[s];


• RTT ⊨_Γ (¬g)[s] if RTT ⊭_Γ g[s];


• f = z(k_1,...,k_n). The order of f[z:=s(z)] is lower than the order of
f. Therefore we can define: RTT ⊨_Γ f[s] if RTT ⊨_Γ (f[z:=s(z)])[s];


• f = ∀x:t^a[g]. The order of g is equal to the order of ∀x:t^a[g], so we
can assume that RTT ⊨_{Γ'} g[s'] has already been defined for contexts
Γ' and valuations s'. Therefore we can define RTT ⊨_Γ (∀x:t^a[g])[s] if
for all h ∈ P ∪ A for which Γ \ {x:t^a} ⊢ h : t^a, RTT ⊨_{Γ[x:t^a]} g[s[x:=h]].


Here Γ[x:t^a] is the same context as Γ, except that we now assign the type
t^a to the variable x. We write RTT ⊨ f instead of RTT ⊨_Γ f if it is clear
which context Γ is used.


3c2.2 RTT embedded in KTT 


To embed RTT in a first order language L, we have to cope with two tech- 
nical problems: 
1. We need to encode the notion of and the manipulation with (higher-order)
propositional functions into a first-order language. The manipulation
is particularly important with respect to substitution, which
in the higher-order situation is much more complicated than in the
first order case (cf. the definition of substitution 2.24);


2. In Russell’s theory, it is only possible to quantify over a part of all
propositions. This makes it impossible to translate, for instance, the
proposition ∀p:()^1[p() ∨ ¬p()] directly to ∀x[T(x) ∨ ¬T(x)], as the
quantifier in the latter also quantifies over (codes of) higher-order
propositions.


As we do not want RTT-contexts to be involved in this coding, we assume 
that each variable in RTT (implicitly) has a superscript t, indicating its 
ramified type. We only consider the legal propositional functions of RTT, 
and given a context Γ it is always possible to assign a unique type to each
free variable in such a pf (cf. Lemma 2.56.1). Therefore we can do without 
contexts, as the types of the variables are now clear from the function in 
which they occur. For reasons of clarity, we will not explicitly write this 
superscript, as long as no confusion arises. 

We propose the following solutions to the problems sketched above (we 
first give the definition and afterwards explain our thoughts behind it): 


Definition 3.15 Extend the language L(T) with for each ramified type t
a monadic predicate Typ_t, and for each n ∈ ℕ an (n+1)-ary function App_n.
We code the typable propositional functions f of P as formulas f̄ in this
extension⁶. We do this by induction on the structure of f.


• If f = R(i_1,...,i_{a(R)}), then f is present in the original language L
and we take f̄ ≝ f;


• If f = f_1 ∨ f_2, then f̄ ≝ f̄_1 ∨ f̄_2;
• If f = ¬f', then f̄ ≝ ¬f̄';


• If f = ∀x:u[f'], then f̄ ≝ ∀x[¬Typ_u(x) ∨ f̄'];

• If f = z(k_1,...,k_m), write K_i = ⟨k̄_i⟩ for k_i ∈ P, and K_i = k_i for
k_i ∈ A ∪ V.⁷ Define f̄ ≝ T(App_m(z, K_1,...,K_m)).


⁶This mapping (written here with an overline) is different from the mapping ~ that was used in Chapter 2.
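
The coding above is a structural recursion and can be sketched as a small translator from pfs to first-order formulas. This is an illustrative sketch only, with several assumptions not found in the text: pfs are represented as nested tuples, the output is a plain string, the code ⟨g⟩ of a formula g is rendered by writing angle brackets around it, and the function name `translate` is hypothetical.

```python
def translate(f):
    """Map a pf (nested tuple) to a first-order formula (string)."""
    tag = f[0]
    if tag == "rel":
        # R(i_1,...,i_n) is already a formula of the original language.
        _, name, args = f
        return f"{name}({', '.join(args)})"
    if tag == "or":
        return f"({translate(f[1])} ∨ {translate(f[2])})"
    if tag == "not":
        return f"¬{translate(f[1])}"
    if tag == "forall":
        # The quantifier is relativised to the codes of objects of type u.
        _, x, u, body = f
        return f"∀{x}[¬Typ_{u}({x}) ∨ {translate(body)}]"
    if tag == "app":
        # z(k_1,...,k_m): variables and individuals are passed as they are,
        # pfs are passed as the code of their translation.
        _, z, ks = f
        coded = [k if isinstance(k, str) else f"⟨{translate(k)}⟩" for k in ks]
        return f"T(App_{len(ks)}({', '.join([z] + coded)}))"
    raise ValueError(tag)


# The pf  ∀p:()^1[p() ∨ ¬p()]  from the second technical problem above.
example = ("forall", "p", "()^1",
           ("or", ("app", "p", []), ("not", ("app", "p", []))))
coded_example = translate(example)

# A pf applied to an individual and to another pf.
nested = ("app", "z", ["a", ("rel", "R", ["b"])])
coded_nested = translate(nested)
```

The translated quantifier ranges over the whole first-order domain, but the guard `¬Typ_u(x) ∨ ...` makes every element outside the intended type satisfy the body vacuously.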


Notation 3.16 To keep notations uniform, we sometimes want to speak
about ⟨x̄⟩ when we only intend to mention x, for x ∈ V, and about ⟨ā⟩
when only meaning a, for a ∈ A. Hence, we formally define: ⟨x̄⟩ ≝ x and
⟨ā⟩ ≝ a for all x ∈ V and all a ∈ A.


We now give a formal interpretation of the newly introduced predicate
symbol Typ_t and function symbol App_n. We take the set 𝒜 of constant
symbols as the domain of our model, so: A = 𝒜. This corresponds with the
fact that Russell did not make a distinction between syntax and semantics.
The following definition is also based on this fact:


Definition 3.17 We define the function [ ]:
• [a] ≝ a for all a ∈ A;
• [R] ≝ R for all R ∈ ℛ;


• [Typ_{0^0}] ≝ A, and for t ≠ 0^0, [Typ_t] ≝ {[⟨f̄⟩] | f ∈ P, f : t};


• We do not give a full semantics of the function App_n. This is because
we need App_n and its semantics only in some special cases. Assume
n ∈ ℕ, f ∈ P is of type (t_1,...,t_n) and has free variables x_1 < ··· <
x_n. Also assume k_1,...,k_n ∈ A ∪ P, k_i : t_i. We define:


[App_n]([⟨f̄⟩], [⟨k̄_1⟩], ..., [⟨k̄_n⟩]) ≝ [⟨ḡ⟩], where g = f[x_1:=k_1,...,x_n:=k_n].


Together, A and [ ] form a model N for the translation of RTT in KTT.


We make some remarks with respect to these definitions. 


Remark 3.18 It is clear that the newly introduced functions App_n can be
used for carrying out substitutions, thus solving the first of the technical
problems stated at the beginning of this subsection. The predicates Typ_t
(typability with type t) solve the second problem, as can be seen in the
coding of ∀x:u[f].

⁷Observe: ⟨ and ⟩ do not belong to the syntax of L extended with T, Typ_t and App_n.
We define K_i to be the encoding of the translation of k_i, and not the list of symbols
started with ⟨, followed by the list of symbols that represent k_i, and closed by ⟩.
Remark 3.19 At this point, our work is related to (but independent of)
Paul Gilmore’s work on NaDSet 1. NaDSet 1 is a theory of generalised 
abstraction which makes n-ary predication a primitive of the system, with 
the unary truth predicate being trivially definable upon this basis. For a 
useful connection between KTT and NaDSet 1, see [44]. 


Remark 3.20 The extensions of $L(T)$ with the relation symbols $\mathrm{Typ}_t$ and the function symbols $\mathrm{App}_n$ are of a merely technical character. Therefore, we think that we can still speak of an embedding of RTT within KTT.


Below, we work in two systems: RTT and KTT. These systems have 
a different notion of substitution, though they use the same notation for 
expressing substitution. From the context however, it will always be clear 
which kind of substitution is meant. 

The language $L(T)$ extended with $\mathrm{Typ}_t$ and $\mathrm{App}_n$ is an example of the languages described in Section 3c1, and we can construct $\mathfrak{R}_\alpha$ for each ordinal $\alpha$ as described in that section.

Substitution in the language KTT is ordinary first-order substitution. Higher-order substitutions in KTT can be obtained via the application operators $\mathrm{App}_n$. For future results, it is essential to know that the combination of first-order substitution and the operators $\mathrm{App}_n$ in KTT is compatible with the higher-order substitution for RTT that was defined in Definition 2.24. This is shown in the following Lemma (we write $f[x_i:=g_i]_{i=1}^{n}$ as shorthand for $f[x_1:=g_1,\dots,x_n:=g_n]$).


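Lemma 3.21 (Substitution Lemma) Let $g$ be a legal propositional function such that $\mathrm{FV}(g) = \{x_1,\dots,x_m\}$, and $k_1,\dots,k_m \in \mathcal{A} \cup \mathcal{P}$ such that $x_i$ and $k_i$ have the same type for all $i$. Let $p$ be at least the order of $g[x_i:=k_i]_{i=1}^{n-1}$, and $q$ at least the order of $g[x_i:=k_i]_{i=1}^{n}$. Then

$$\mathfrak{R}_p \models \overline{g[x_i:=k_i]_{i=1}^{n-1}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$

if and only if

$$\mathfrak{R}_q \models \overline{g[x_i:=k_i]_{i=1}^{n}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}.$$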


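PROOF: We prove the following two statements:

1. $\mathfrak{R}_p \models \overline{g[x_i:=k_i]_{i=1}^{n-1}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$ if and only if $\mathfrak{R}_q \models \overline{g[x_i:=k_i]_{i=1}^{n}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}$;

2. $\mathfrak{R}_p \models \neg\overline{g[x_i:=k_i]_{i=1}^{n-1}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$ if and only if $\mathfrak{R}_q \models \neg\overline{g[x_i:=k_i]_{i=1}^{n}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}$.

It is necessary to prove these two statements instead of the single statement of the Lemma because of the special way in which we defined $\mathfrak{R}_\alpha \models \neg f$. We use induction on the order of $g$. So assume (induction hypothesis A) that the substitution lemma is proved for all $g'$ with an order smaller than the order of $g$. Use induction on the structure of $g[x_i:=k_i]_{i=1}^{n-1}$ (induction hypothesis B).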


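• $g[x_i:=k_i]_{i=1}^{n-1} \equiv R(a_1,\dots,a_{a(R)})$.

1. Notice: $\overline{g[x_i:=k_i]_{i=1}^{n-1}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m} \equiv \overline{g[x_i:=k_i]_{i=1}^{n}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}$. As the truth of $R(a_1,\dots,a_{a(R)})$ can always be established at level 0 (so in $\mathfrak{R}_0$), there is nothing to be proved;

2. Similar;

• $g[x_i:=k_i]_{i=1}^{n-1} \equiv g_1 \vee g_2$.

1. The following statements are equivalent:

$$\mathfrak{R}_p \models \overline{g_1 \vee g_2}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models (\overline{g_1} \vee \overline{g_2})\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models \overline{g_1}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m} \vee \overline{g_2}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models \overline{g_j}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m} \text{ for some } j \in \{1,2\}. \quad (2)$$

As the order of $g_j$ is at most the order of $g[x_i:=k_i]_{i=1}^{n-1}$, and the order of $g_j[x_n:=k_n]$ is at most the order of $g[x_i:=k_i]_{i=1}^{n}$, we can apply induction hypothesis B. Therefore, (2) is equivalent to the following statements:

$$\mathfrak{R}_q \models \overline{g_j[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m} \text{ for some } j \in \{1,2\};$$
$$\mathfrak{R}_q \models \overline{g_1[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m} \vee \overline{g_2[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m};$$
$$\mathfrak{R}_q \models \big(\overline{g_1[x_n:=k_n]} \vee \overline{g_2[x_n:=k_n]}\big)\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m};$$
$$\mathfrak{R}_q \models \overline{(g_1 \vee g_2)[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}.$$

2. Similar;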
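• $g[x_i:=k_i]_{i=1}^{n-1} \equiv \neg g'$.

1. This is induction hypothesis B(2) on $g'$;

2. The following statements are equivalent:

$$\mathfrak{R}_p \models \neg\overline{g[x_i:=k_i]_{i=1}^{n-1}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models \neg\neg\overline{g'}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models \overline{g'}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m} \quad (3)$$

By induction hypothesis B(1) on $g'$, (3) is equivalent to the following statements:

$$\mathfrak{R}_q \models \overline{g'[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m};$$
$$\mathfrak{R}_q \models \neg\neg\overline{g'[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m};$$
$$\mathfrak{R}_q \models \neg\overline{(\neg g')[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}.$$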
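• $g[x_i:=k_i]_{i=1}^{n-1} \equiv \forall x{:}t[g']$.

1. The following statements are equivalent:

$$\mathfrak{R}_p \models \overline{\forall x{:}t[g']}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models \forall x[\neg\mathrm{Typ}_t(x) \vee \overline{g'}]\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models \forall x\big[\neg\mathrm{Typ}_t(x) \vee \overline{g'}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}\big]$$
$$\mathfrak{R}_p \models \overline{g'}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}\,[x:=\langle\overline{h}\rangle] \text{ for all } h : t. \quad (4)$$

By induction hypothesis B(1) on $g'$, (4) is equivalent to the following statements:

$$\mathfrak{R}_q \models \overline{g'[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}\,[x:=\langle\overline{h}\rangle] \text{ for all } h : t;$$
$$\mathfrak{R}_q \models \forall x\big[\neg\mathrm{Typ}_t(x) \vee \overline{g'[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}\big];$$
$$\mathfrak{R}_q \models \forall x\big[\neg\mathrm{Typ}_t(x) \vee \overline{g'[x_n:=k_n]}\big]\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m};$$
$$\mathfrak{R}_q \models \overline{\forall x{:}t\big[g'[x_n:=k_n]\big]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m};$$
$$\mathfrak{R}_q \models \overline{(\forall x{:}t[g'])[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}.$$

2. Similar;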

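• $g[x_i:=k_i]_{i=1}^{n-1} \equiv z(h_1,\dots,h_r)$.

1. If $x_n \not\equiv z$, then $\overline{g[x_i:=k_i]_{i=1}^{n-1}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m} \equiv \overline{g[x_i:=k_i]_{i=1}^{n}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}$ and there is nothing to be proved. So assume $z \equiv x_n$. Let $y_1 < \dots < y_r$ be the free variables of $k_n$. The following statements are equivalent:

$$\mathfrak{R}_p \models \overline{x_n(h_1,\dots,h_r)}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models T\big(\mathrm{App}_r(x_n, \langle\overline{h_1}\rangle,\dots,\langle\overline{h_r}\rangle)\big)\,[x_i:=\langle\overline{k_i}\rangle]_{i=n}^{m}$$
$$\mathfrak{R}_p \models T\big(\mathrm{App}_r(\langle\overline{k_n}\rangle, \langle\overline{h_1}\rangle,\dots,\langle\overline{h_r}\rangle)\big)\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m} \quad (5)$$

Let $h'_j = k_i$ if $h_j \equiv x_i$, and $h'_j = h_j$ if $h_j \notin \{x_{n+1},\dots,x_m\}$. Then (5) is equivalent to the following statements:

$$\mathfrak{R}_p \models T\big(\mathrm{App}_r(\langle\overline{k_n}\rangle, \langle\overline{h'_1}\rangle,\dots,\langle\overline{h'_r}\rangle)\big);$$
$$\mathfrak{R}_{p-1} \models \overline{k_n[y_j:=h'_j]_{j=1}^{r}};$$
$$\mathfrak{R}_{p-1} \models \overline{k_n[y_j:=h_j]_{j=1}^{r}\,[x_i:=k_i]_{i=n+1}^{m}} \quad (6)$$

We can use induction hypothesis A: the order of $k_n$ is smaller than the order of $z(h_1,\dots,h_r)$ and therefore smaller than the order of $g$ (see Corollary 2.63). Thus (6) is equivalent to the following statements:

$$\mathfrak{R}_q \models \overline{k_n[y_j:=h_j]_{j=1}^{r}}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m};$$
$$\mathfrak{R}_q \models \overline{x_n(h_1,\dots,h_r)[x_n:=k_n]}\,[x_i:=\langle\overline{k_i}\rangle]_{i=n+1}^{m}.$$

2. Similar.
□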


Corollary 3.22 Assume $g$ is a pf of order $p$ and $g[x:=k]$ is a proposition of order $q$. If $\mathfrak{R}_q \models \overline{g[x:=k]}$ then $\mathfrak{R}_p \models \overline{g}\,[x:=\langle\overline{k}\rangle]$. □


Remark 3.23 We have actually proved a stronger fact: Assume $g$ is a propositional function of order $m$ and $g[x:=k]$ is a proposition of order $n$. If $\mathfrak{R}_n \models \overline{g[x:=k]}$ then $\mathfrak{R}_p \models \overline{g}\,[x:=\langle\overline{k}\rangle]$, where $p = \min(m, n+1)$. This tells us more about the role of the predicate $T$: Although a substitution may lower the order of a propositional function by more than one, only one application of the $T$-predicate is involved (hence only one level in the hierarchy of truths). However, in the theorem below we only need the (weaker) form in which we presented the Substitution Lemma originally.


We now prove the main theorem of this section. 


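Theorem 3.24 (Embedding of RTT in KTT) Let $\Gamma$ be an RTT-context with domain $\mathcal{V}$. Let $f \in \mathcal{P}$ be a legal pf in $\Gamma$ with free variables $x_1,\dots,x_m$ of types $t_1^{a_1},\dots,t_m^{a_m}$. Let $s : \mathcal{V} \to \mathcal{P}$ be an assignment function such that $s(x_i)$ has type $t_i^{a_i}$. Let $n$ be the order of $f[s] \stackrel{\mathrm{def}}{=} f[x_i:=s(x_i)]_{i=1}^{m}$.

Then RTT $\models f[s]$ if and only if $\mathfrak{R}_n \models \overline{f[s]}$.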


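PROOF:

$\Rightarrow$ As in the proof of Lemma 3.21, we have to deal with the special way in which we defined $\mathfrak{R}_n \models \neg f$. Therefore, we simultaneously prove:

1. If RTT $\models f[s]$ then $\mathfrak{R}_n \models \overline{f[s]}$;
2. If RTT $\models \neg f[s]$ then $\mathfrak{R}_n \models \neg\overline{f[s]}$.

The proof follows the same induction structure as the definition of RTT $\models f[s]$ (3.14).

• $f \equiv R(a_1,\dots,a_{a(R)})$ for some $R \in \mathcal{R}$ and some $a_1,\dots,a_{a(R)} \in \mathcal{A} \cup \mathcal{V}$. Notice that $\overline{f} \equiv f$ and that $\overline{f[s]} \equiv f[s]$. Write
$$f[s] \equiv R(a'_1,\dots,a'_{a(R)})$$
for certain $a'_1,\dots,a'_{a(R)} \in \mathcal{A}$. As RTT $\models R(a'_1,\dots,a'_{a(R)})$, we know that $(a'_1,\dots,a'_{a(R)}) \in R$, hence $\mathfrak{R}_n \models R(a'_1,\dots,a'_{a(R)})$. The proof is similar for $\neg f$;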
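• $f \equiv g_1 \vee g_2$.

Then $g_1$ and $g_2$ are legal. Assume $n_j$ is the order of $g_j[s]$. Notice that $n_j \le n$.

First assume RTT $\models f[s]$. As
$$f[s] \equiv (g_1 \vee g_2)[s] \stackrel{(2.30.1)}{\equiv} g_1[s] \vee g_2[s],$$
we have RTT $\models g_j[s]$ for $j = 1$ or $j = 2$. By the induction hypothesis on the structure of $f$: $\mathfrak{R}_{n_j} \models \overline{g_j[s]}$, and as $n_j \le n$: $\mathfrak{R}_n \models \overline{g_j[s]}$. Therefore, $\mathfrak{R}_n \models \overline{g_1[s]} \vee \overline{g_2[s]}$. Now observe that
$$\overline{f[s]} \equiv \overline{(g_1 \vee g_2)[s]} \equiv \overline{g_1[s]} \vee \overline{g_2[s]}.$$

Now assume RTT $\models \neg f[s]$. By a similar argument, we find that RTT $\models \neg g_j[s]$ for $j = 1$ and $j = 2$. By the induction hypothesis on the structure of $f$, $\mathfrak{R}_{n_j} \models \neg\overline{g_j[s]}$ for $j = 1,2$, so $\mathfrak{R}_n \models \neg\overline{g_1[s]} \wedge \neg\overline{g_2[s]}$, or in other words:
$$\mathfrak{R}_n \models \neg\big(\overline{g_1[s]} \vee \overline{g_2[s]}\big).$$
Observe that $\neg\big(\overline{g_1[s]} \vee \overline{g_2[s]}\big) \equiv \neg\overline{f[s]}$;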


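• $f \equiv \neg g$.

If RTT $\models f[s]$ then use the induction hypothesis on the structure of $g$ to get $\mathfrak{R}_n \models \neg\overline{g[s]}$, hence $\mathfrak{R}_n \models \overline{f[s]}$.

If RTT $\models \neg f[s]$, then RTT $\models g[s]$, so by induction on the structure of $g$, $\mathfrak{R}_n \models \overline{g[s]}$, so $\mathfrak{R}_n \models \neg\neg\overline{g[s]}$, so $\mathfrak{R}_n \models \neg\overline{f[s]}$;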

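• $f \equiv \forall x{:}t[g]$.

If RTT $\models f[s]$ then for all $k : t$, RTT $\models g[s[x:=k]]$, where $s[x:=k]$ is the assignment function that assigns $k$ to $x$, and $s(y)$ to all $y \in \mathcal{V} \setminus \{x\}$. By the induction hypothesis on the structure of $f$, we know that for all $k : t$, $\mathfrak{R}_{n_k} \models \overline{g[s[x:=k]]}$, where $n_k$ is the order of $g[s[x:=k]] \equiv g[x_i:=s(x_i)]_{i=1}^{m}[x:=k]$. By Corollary 3.22 we have: For all $k : t$,
$$\mathfrak{R}_n \models \overline{g[x_i:=s(x_i)]_{i=1}^{m}}\,[x:=\langle\overline{k}\rangle].$$
Hence, for all $a \in \mathcal{A}$,
$$\mathfrak{R}_n \models \big(\neg\mathrm{Typ}_t(x) \vee \overline{g[x_i:=s(x_i)]_{i=1}^{m}}\big)[x:=a].$$
So $\mathfrak{R}_n \models \forall x\big[\neg\mathrm{Typ}_t(x) \vee \overline{g[x_i:=s(x_i)]_{i=1}^{m}}\big]$. Observe:
$$\overline{\forall x{:}t\big[g[x_i:=s(x_i)]_{i=1}^{m}\big]} \equiv \forall x\big[\neg\mathrm{Typ}_t(x) \vee \overline{g[x_i:=s(x_i)]_{i=1}^{m}}\big].$$
The argument for RTT $\models \neg f$ is similar;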

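• $f \equiv z(h_1,\dots,h_p)$. Determine $q$ such that $z \equiv x_q$. Write $h'_j = h_j$ if $h_j \notin \{x_1,\dots,x_m\}$; $h'_j = s(x_\ell)$ if $h_j \equiv x_\ell$. Assume that $s(x_q)$ has free variables $y_1 < \dots < y_p$. Observe:
$$f[x_i:=s(x_i)]_{i=1}^{m} \equiv s(x_q)[y_j:=h'_j]_{j=1}^{p}.$$
So if RTT $\models f[x_i:=s(x_i)]_{i=1}^{m}$, then RTT $\models s(x_q)[y_j:=h'_j]_{j=1}^{p}$. As the order of $s(x_q)$ is smaller than the order of $f$, we can use induction on the order of $f$ to obtain (with Conservation of Knowledge): $\mathfrak{R}_n \models \overline{s(x_q)[y_j:=h'_j]_{j=1}^{p}}$, which is equivalent to: $\mathfrak{R}_n \models \overline{f[s]}$. The proof for RTT $\models \neg f[x_i:=s(x_i)]_{i=1}^{m}$ is similar.

$\Leftarrow$ This is now easily shown by contraposition. Assume, for the sake of the argument, that RTT $\models f[s]$ does not hold. Then RTT $\models \neg f[s]$, so by the $\Rightarrow$ part of the theorem (that was proved above), $\mathfrak{R}_n \models \neg\overline{f[s]}$. So, if $\mathfrak{R}_n \models \overline{f[s]}$ then RTT $\models f[s]$. □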


This theorem clearly shows the relation between the orders in RTT and the levels of truth in KTT. The heart of the proof of Theorem 3.24 is in the proof of case $x_n(h_1,\dots,h_r)$ of the Substitution Lemma 3.21 (via its Corollary 3.22). This is the only place in the proof where the properties of the predicate $T$ are used. It is understandable that these properties must be used at exactly this place when we look at the definition of propositional functions and the typing rules for propositional functions. Exactly the possibility of constructing a propositional function of the form $x_n(h_1,\dots,h_r)$ makes it possible to arrive at higher-order propositional functions and higher-order propositions. So exactly at this spot, Kripke's predicate $T$ must appear, in order to raise one level in KTT as well.

One might expect to need the properties of $T$ in the proof of the case $f \equiv z(h_1,\dots,h_p)$ of Theorem 3.24 as well. But we see that this is not the case. This is understandable: we do not consider the truth of $z(h_1,\dots,h_p)$ itself, but of $z(h_1,\dots,h_p)[x_i:=s(x_i)]_{i=1}^{m}$. And we do not work with the order of $z(h_1,\dots,h_p)$, but with $n$, the order of $z(h_1,\dots,h_p)[x_i:=s(x_i)]_{i=1}^{m}$. The shift to a lower order has to do with the orders of $z(h_1,\dots,h_p)$ and $s(x_q)$, but not with the orders of $z(h_1,\dots,h_p)[x_i:=s(x_i)]_{i=1}^{m}$ and $s(x_q)[y_j:=h'_j]_{j=1}^{p}$. These last two propositions are syntactically equivalent and therefore of the same order.


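Corollary 3.25 If $f \in \mathcal{P}$ is a legal proposition of order $n$, then RTT $\models f$ if and only if $\mathfrak{R}_n \models \overline{f}$. □


Corollary 3.26 (Conservativity of $\mathfrak{R}_\omega$ over RTT) RTT $\models f$ if and only if $\mathfrak{R}_\omega \models \overline{f}$. □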


We cannot improve the result of Theorem 3.24 in general: For all $n$, there are propositions $f$ of order $n$ in RTT whose code is provable at level $\mathfrak{R}_n$ in KTT, but not at any lower level.


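Theorem 3.27 Let $n > 0$, and let $f_n$ be the $n$th-order proposition
$$\forall p{:}()^{n-1}\,[p() \vee \neg p()].$$

Then:
$$\mathfrak{R}_m \models \overline{f_n} \text{ if and only if } m \ge n.$$

PROOF:

$\Leftarrow$ follows from Theorem 3.24 and Lemma 3.10;

$\Rightarrow$ is by induction on $n$. Observe that
$$\overline{f_n} \equiv \forall p\big[\neg\mathrm{Typ}_{()^{n-1}}(p) \vee \big(T(\mathrm{App}_0(p)) \vee \neg T(\mathrm{App}_0(p))\big)\big].$$

• $n = 1$. Let $g$ be any proposition of order 0 in RTT. Then $\mathfrak{R}_0 \models \mathrm{Typ}_{()^0}(\langle\overline{g}\rangle)$, but as $T$ is completely undefined at level 0 ($S_{0,1} = S_{0,2} = \emptyset$),
$$\mathfrak{R}_0 \not\models T(\mathrm{App}_0(\langle\overline{g}\rangle)) \vee \neg T(\mathrm{App}_0(\langle\overline{g}\rangle)).$$
Hence, $\mathfrak{R}_0 \not\models \overline{f_1}$;

• Assume the theorem has been proved for $n - 1$. Assume $m < n$ and $\mathfrak{R}_m \models \overline{f_n}$. By definition of $\models$, we have:
$$\mathfrak{R}_m \models T(\mathrm{App}_0(\langle\overline{f_{n-1}}\rangle)) \vee \neg T(\mathrm{App}_0(\langle\overline{f_{n-1}}\rangle)),$$
and for reasons of consistency: $\mathfrak{R}_m \models T(\mathrm{App}_0(\langle\overline{f_{n-1}}\rangle))$. Therefore, $\mathfrak{R}_m \models T(\langle\overline{f_{n-1}}\rangle)$, so, by the definition of $T$: $\mathfrak{R}_{m-1} \models \overline{f_{n-1}}$, which contradicts the induction hypothesis, as $m - 1 < n - 1$.

□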


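There are, however, for any $n$, propositions $f$ of order $n$ in RTT for which $\mathfrak{R}_m \models \overline{f}$ or $\mathfrak{R}_m \models \neg\overline{f}$ can already be established for some $m < n$.


Example 3.28 Consider a proposition $g \equiv g_1 \vee g_2$ where $g_1$ is a true proposition of order $m$ and $g_2$ is any proposition of order $n > m$. As $g_1$ is true in RTT, we have $\mathfrak{R}_m \models \overline{g_1}$, and therefore $\mathfrak{R}_m \models \overline{g}$.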


3c2.3 The restrictiveness of Russell’s theory 


We illustrate the different approaches of Russell and Kripke by an example 
given by Kripke himself in [78]. 


Example 3.29 Two politicians, Wim and Frits, are quarrelling about who is telling the truth and who is lying. Of course, Wim states that anything said by Frits is untrue (A), and Frits argues that any statement of Wim is false (B). The utterances (A) and (B) can be complete nonsense, but they can also be meaningful. This depends not only on the (syntactic) structure of (A) and (B), but also on their semantics, that is: on the utterances of Wim and Frits (which may be more than only (A) and (B)):

Footnote: Any correspondence with existing Dutch politicians is purely coincidental.
1. Assume (A) is the only statement that Wim makes, and (B) is the only statement that was made by Frits. Then (A) and (B) are nonsense. More precisely, there is no reason to believe that (A) is true, and there is no reason either to believe that (B) is true. Namely: if we want to prove that (A) is true, we must first show that all statements of Frits are false, in other words: that (B) is false. But in order to establish the falsehood of (B) we must first find a true statement of Wim, that is: we must prove (A). Summarising: The truth of (A) can only be established if the truth of (A) has already been established before. So the truth of (A) will never be established.

Similarly, we show that the falsehood of (A) will never be established, and that neither truth nor falsehood of (B) will ever be established;


2. Now assume that (B) is still the only statement made by Frits, but that Wim has not only uttered (A), but also asserts that one equals one (C). Statement (C) is clearly true. This means that Frits has been lying. Therefore, Frits' only statement (B) is false, hence Wim's statement (A) is true.


We formalise this situation as follows. We assume that the first-order language $L$ contains at least three relation symbols, $W$, $F$ and $=$. Moreover, we assume that it has an individual symbol $1$. $W$ will be interpreted as the set of (codes of the) utterances of Wim, and $F$ will represent the set of (codes of the) utterances of Frits. In this way, we can encode the expression (A) by
$$A \equiv \forall x[\neg F(x) \vee \neg T(x)].$$
Here, $T$ is the truth predicate as introduced earlier in this section. Similarly, (B) is encoded by
$$B \equiv \forall x[\neg W(x) \vee \neg T(x)]$$
and (C) is encoded by
$$C \equiv 1 = 1.$$
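We model the situations 1 and 2 above as follows:

1. In the first situation we take $\mathbb{N}$ as our domain. The semantics of $W$ must represent the set of utterances of Wim, so we let $\mathbf{W} = [\![W]\!] = \{[\![\langle A\rangle]\!]\}$. Similarly, $\mathbf{F} = [\![F]\!] = \{[\![\langle B\rangle]\!]\}$. Further we let $[\![1]\!] = 1$. This gives us a model $\mathfrak{M}^1$ for $L$. We can build a hierarchy $\mathfrak{M}^1_\alpha$ (for ordinals $\alpha$) as explained in Section 3c1. We show that there is no ordinal $\alpha$ for which $\mathfrak{M}^1_\alpha \models A$.

Assume $\alpha$ is the smallest ordinal for which $\mathfrak{M}^1_\alpha \models A$. Then $\mathfrak{M}^1_\alpha \models (\neg F(x) \vee \neg T(x))[s[x:=n]]$ for all $n \in \mathbb{N}$ and all assignment functions $s : \mathcal{V} \to \mathbb{N}$. This means that for all $n \in \mathbb{N}$, either $n \notin \mathbf{F}$ or $n \in S_{\alpha,2}$. Notice that $[\![\langle B\rangle]\!] \in \mathbf{F}$. Therefore: $[\![\langle B\rangle]\!] \in S_{\alpha,2}$. But then there is $\beta < \alpha$ such that $\mathfrak{M}^1_\beta \models \neg B$. Using the definition of $\mathfrak{M}^1_\beta \models \neg B$, this means that there is $n \in \mathbb{N}$ such that $n \in \mathbf{W}$ and $n \in S_{\beta,1}$. The only candidate for such $n$ is $[\![\langle A\rangle]\!]$, as this is the only element of $\mathbf{W}$. Hence: $[\![\langle A\rangle]\!] \in S_{\beta,1}$. Therefore, there is $\gamma < \beta$ for which $\mathfrak{M}^1_\gamma \models A$. This is a contradiction because $\gamma < \alpha$, and $\alpha$ is the smallest ordinal for which $\mathfrak{M}^1_\alpha \models A$.

In a similar way we can show that for all $\alpha$, $\mathfrak{M}^1_\alpha \not\models \neg A$, $\mathfrak{M}^1_\alpha \not\models B$, and $\mathfrak{M}^1_\alpha \not\models \neg B$. This corresponds to our earlier conclusions that the sentences (A) and (B) are nonsense if they are the only utterances of Wim and Frits;


2. For the second situation, we change the model $\mathfrak{M}^1$ to a model $\mathfrak{M}^2$ by replacing $\mathbf{W}$ by $\{[\![\langle A\rangle]\!], [\![\langle C\rangle]\!]\}$. Notice that $\mathfrak{M}^2_0 \models C$, because $1 = 1$. Therefore, $[\![\langle C\rangle]\!] \in S_{1,1}$, so $\mathfrak{M}^2_1 \models T(\langle C\rangle)$. As $[\![\langle C\rangle]\!] \in \mathbf{W}$, we also have $\mathfrak{M}^2_1 \models W(\langle C\rangle)$. Therefore, $\mathfrak{M}^2_1 \models (W(x) \wedge T(x))[s[x:=[\![\langle C\rangle]\!]]]$, for any assignment function $s : \mathcal{V} \to \mathbb{N}$. Hence $\mathfrak{M}^2_1 \models \neg B$.

This shows that Frits' statement is indeed false, but it also shows that Wim's statement (A) is true at level 2: As $\mathfrak{M}^2_1 \models \neg B$, $[\![\langle B\rangle]\!] \in S_{2,2}$, and this implies that $\mathfrak{M}^2_2 \models A$.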
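The two situations can also be replayed mechanically. The following Python sketch is only an illustration and not the construction of Section 3c1 itself. It rests on several assumptions that are not in the text: the domain is restricted to the three codes "A", "B" and "C" (instead of all of N), a missing truth value is represented by None, formulas are evaluated with strong Kleene connectives, and the helper names (neg, disj, forall, hierarchy) are hypothetical.

```python
def neg(v):
    """Strong Kleene negation; None means 'no truth value yet'."""
    return None if v is None else (not v)


def disj(a, b):
    """Strong Kleene disjunction."""
    if a is True or b is True:
        return True
    if a is False and b is False:
        return False
    return None


def forall(values):
    """Universal quantifier over a finite domain (strong Kleene)."""
    values = list(values)
    if any(v is False for v in values):
        return False
    if all(v is True for v in values):
        return True
    return None


def hierarchy(W, F, levels=4):
    """Return, per level, the truth values of A, B and C.

    W and F are the sets of codes uttered by Wim and Frits.
    Simplifying assumption: the domain is just {"A", "B", "C"}.
    """
    domain = ["A", "B", "C"]
    true_codes, false_codes = set(), set()  # S_{0,1} = S_{0,2} = empty set
    history = []
    for _ in range(levels):
        def T(x):
            # Truth predicate of the current level.
            if x in true_codes:
                return True
            if x in false_codes:
                return False
            return None

        value = {
            # A = forall x [not F(x) or not T(x)]
            "A": forall(disj(x not in F, neg(T(x))) for x in domain),
            # B = forall x [not W(x) or not T(x)]
            "B": forall(disj(x not in W, neg(T(x))) for x in domain),
            # C = (1 = 1)
            "C": True,
        }
        history.append(value)
        # The next level's extension and anti-extension of T.
        true_codes = {c for c, v in value.items() if v is True}
        false_codes = {c for c, v in value.items() if v is False}
    return history


situation1 = hierarchy(W={"A"}, F={"B"})
situation2 = hierarchy(W={"A", "C"}, F={"B"})
```

In this sketch, situation1 leaves "A" and "B" without a value at every computed level, while in situation2 the sentence "B" becomes false at index 1 and "A" becomes true at index 2, in line with the informal argument above.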


In Russell's terminology it would not be possible to type expressions like $A$ and $B$ at all. The expression $A$ involves $B$, and therefore has to be of a higher order than $B$. Similarly, $B$ involves $A$, so it has to be of a higher order than $A$.

This indicates an important difference between RTT and KTT: Kripke allows many more expressions to be included in the system. In some situations these expressions will never obtain any truth value (like $A$ and $B$ in the first example), but in other situations (so with other definitions of the primitive predicates) the same expressions will get a truth value. Kripke concludes:
“it would be fruitless to look for an intrinsic criterion that will
enable us to sieve out — as meaningless, or ill-formed — those 
sentences which lead to paradox” 


([78], p. 692) 


Example 3.30 Another, more formal, example of a proposition $f$ in KTT for which there is no $g \in \mathcal{P}$ with $\overline{g} \equiv f$ is the proposition
$$f \stackrel{\mathrm{def}}{=} \forall x[T(x) \vee \neg T(x)].$$

Notice that this is an impredicative proposition. It expresses that all propositions are either true or false, including $f$ itself.

Assume, for the sake of the argument, that $\overline{g} \equiv f$. Let $m$ be the order of $g$. Determine whether RTT $\models g$ or RTT $\models \neg g$. We give the argument for the case RTT $\models g$; the argument for $\neg g$ is easy. If RTT $\models g$ then by Corollary 3.25, $\mathfrak{R}_m \models \forall x[T(x) \vee \neg T(x)]$. This implies $\mathfrak{R}_m \models T(\langle\overline{f_m}\rangle) \vee \neg T(\langle\overline{f_m}\rangle)$, where $f_m$ is as in Theorem 3.27. By definition of $T$ this means $\mathfrak{R}_{m-1} \models \overline{f_m}$ or $\mathfrak{R}_{m-1} \models \neg\overline{f_m}$, both contradicting Theorem 3.27.


3c3 Orders and types 


RTT is based on a double hierarchy: One of types and one of orders. This 
double hierarchy is too restrictive. It is possible to develop Logic and 
Mathematics within RTT, but we saw that the proof of the theorem of 
the least upper bound, which is fundamental in real analysis, cannot be 
given. The origin of the problem is the use of the so-called predicative and 
impredicative propositional functions. 

It is therefore interesting to notice the relation between orders in RTT 
and levels of truth in KTT, as formulated in Theorem 3.24. It shows that 
Kripke’s system can be regarded as a system based on RTT of which not 
the orders, but the types have been removed. In this way, KTT can be seen 
as a system that is dual to the simple theory of types. 

KTT, however, has a more subtle approach than many type theories, as it does not exclude any, possibly "paradoxical", expression from the syntax, which is the usual type-theoretic approach. If an expression is paradoxical, it will not get a truth value at any level $\alpha$ of the hierarchy of truths. Whether an expression is paradoxical or not depends not only on its syntactic structure, but also on the domain $\mathcal{A}$ and the relations of $\mathcal{R}$ on $\mathcal{A}$ (see Example 3.29). So paradoxes are only excluded at the level of semantics.

The discussion above shows that the orders of RTT are not to be blamed 
for the restrictiveness of RTT. KTT is a system which contains orders but 
has only few restrictions towards self-application. It is the combination of 
orders and types that makes RTT restrictive. 

The special structure of KTT makes it possible to define a notion of 
semantic order in RTT: 


Definition 3.31 Let $f \in \mathcal{P}$ have type $t^a$. The semantic order of $f$ is the smallest natural number $n$ for which either $\mathfrak{R}_n \models \overline{f}$ or $\mathfrak{R}_n \models \neg\overline{f}$.


By Theorem 3.24, the semantic order of $f$ is always smaller than or equal to its (syntactic) order.


Conclusions 


We saw in Section 3a1 that the Ramified Theory of Types is very restrictive for the description of mathematics within logic, because it is not possible to formulate impredicative definitions in RTT.

This was already realised by Russell and Whitehead, who tried to solve this by postulating the Axiom of Reducibility 3.3. This axiom has been criticised from the moment it was written down, both by Russell and Whitehead themselves and by others. Ramsey, Hilbert and Ackermann deramify RTT: They remove the orders. They observe that this does not lead to known paradoxes as long as a proper distinction between language and metalanguage is made.

Gödel and Quine observe that the deramification does not violate the 
Vicious Circle Principle, as long as one accepts that objects and pfs exist 
independently of our constructions. 

So historically speaking, one could say that the orders were blamed for the restrictiveness of RTT. In Section 3c we showed that this is not correct. We used the formalisation of RTT that was given in Chapter 2 to compare RTT with Kripke's Theory of Truth KTT. We established the relation between Russell's hierarchy of orders and Kripke's hierarchy of truth-levels. In particular we showed that:


1. A proposition of RTT of order $n$ is true if and only if its interpretation is true at level $n$ in Kripke's Truth Hierarchy (Theorem 3.24);


2. The truth of some propositions of order $n$ of RTT cannot be established in KTT at a level of the truth hierarchy lower than $n$ (Theorem 3.27). Yet for some other propositions, it can be established at an earlier level (Example 3.28).


We also saw that Russell's theory is more restrictive than Kripke’s. On 
the one hand, all propositional functions of RTT can be coded in Kripke’s 
Truth Theory; on the other hand there are formulas of Kripke’s theory that 
cannot be expressed in RTT, respecting both hierarchies. 

We feel that the orders are not to be blamed (alone) for the restrictive- 
ness of RTT. KTT clearly has a structure with orders (see Definition 3.31); 
nevertheless it is possible to give impredicative definitions (see Example 
3.30). 

Russell excludes all propositional functions that might lead to paradox- 
ical situations beforehand. Kripke does not exclude them, though it is not 
guaranteed that each proposition gets a truth value. This may depend on 
the model chosen (see Example 3.29). 

Whether the orders should be blamed or not, the main line in the history continues with non-ramified theories. For example, Church's combination of λ-calculus with simple type theory, the basis for most modern type systems, has no orders.


Chapter 4 


Propositions as Types and 
Proofs as Terms 


In this chapter we discuss the notions of Propositions as Types and Proofs as 
Terms (both abbreviated as PAT). These notions have played an important 
role in the development of Type Theory after the Second World War. They 
opened the possibility to use Type Theory not only as a restrictive method 
(to prevent paradoxes) but also as a constructive method. Many proof 
checkers and theorem provers, like AUTOMATH [95], Coq [42], Nuprl [34], 
LEGO [87], LF [59], use the PAT principle. 

PAT was discovered independently by different persons. In Section 4a we give a historical sketch. In the next two sections we describe how the two most important type systems of the pre-PAT era can be described in a PAT style. This gives insight into the various ways in which PAT-implementations can be made.


4a The discovery of PAT 


In the first three chapters we discussed type systems the way they were initially designed, namely to prevent the logical paradoxes. But although the systems of both Russell and Church have some logical symbols in them (like ∨, ∀), these theories themselves cannot be seen as a logical system. If one wants to make logical derivations, one has to build a logical system on top of one of these type systems.

However, type theory nowadays also plays an important role in logic in a different way: It can be used as a logical system itself. This use of type theory is generally known as "propositions as types" or "proofs as terms". As we will see in this section, both expressions only partially cover the idea of using type theory as a logical system. As they both abbreviate to PAT, we will use this abbreviation to indicate both "propositions as types" and "proofs as terms".

“Proofs as terms” already suggests an important advantage of using 
type theory as a logical system: In this method proofs are first-class citizens 
of the logical system, whilst for many other logical systems, proofs are 
rather complex objects outside the logic (for example: derivation trees), 
and therefore cannot be easily manipulated. 

Below we mention some origins of the PAT principle. 


4a1 Intuitionistic logic


The idea of PAT originates in the formulation of intuitionistic logic. Though it is not correct that "intuitionistic logic" is simply the logic that is used in intuitionistic mathematics¹, there are frequently occurring constructions in intuitionistic mathematics that have a logical counterpart. One of these constructions is the proof of an implication. Heyting [62] describes the proof of an implication $a \to b$ as: Deriving a solution for the problem $b$ from the problem $a$. Kolmogorov [77] is even more explicit, and describes a proof of $a \to b$ as the construction of a method that transforms each proof of $a$ into a proof of $b$. This means that a proof of $a \to b$ can be seen as a (constructive) function from the proofs of $a$ to the proofs of $b$. In other words, the proofs of the proposition $a \to b$ form exactly the set of functions from the set of proofs of $a$ to the set of proofs of $b$. This suggests identifying a proposition with the set of its proofs. Now types are used to represent these sets of proofs. An element of such a set of proofs is represented as a term of the corresponding type. This means that propositions are interpreted as types, and proofs of a proposition $a$ as terms of type $a$.


¹ "Intuitionistic logic" is standard terminology for "logic without the law of the excluded middle". The terminology suggests that it is "the logic that is used in intuitionism". However, intuitionism, that is: the philosophy of Brouwer and the mathematics based on that philosophy, declares mathematics to be independent of logic. According to that philosophy, a proof of a mathematical theorem is a method to read that theorem as a tautology. The fact that one needs a list of tautologies before the proof of more complicated theorems becomes clear, only indicates that the constructions we make are too complicated to be comprehended immediately. Mathematics itself, however, is a construction in one's mind, independent of logic:

“Een logische opbouw der wiskunde, onafhankelijk van de wiskundige intuïtie, is onmogelijk – daar op die manier slechts een taalgebouw wordt verkregen, dat van de eigenlijke wiskunde onherroepelijk gescheiden blijft – en bovendien een contradictio in terminis – daar een logisch systeem, zoo goed als de wiskunde zelf, de wiskundige oer-intuïtie nodig heeft”

(Over de Grondslagen der Wiskunde [19], p. 180)

(A logical construction of mathematics, independent of the mathematical intuition, is impossible – for by this method no more is obtained than a linguistic structure, which irrevocably remains separated from mathematics – and moreover it is a contradictio in terminis – because a logical system needs the basic intuition of mathematics as much as mathematics itself needs it. [Translation from [63]]).
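The reading of a proof of an implication as a function, as described in this section, can be sketched in Python. This is only an illustration under assumptions of our own: Python type hints play the role of propositions, ordinary values play the role of proofs, and the names modus_ponens, syllogism and identity are hypothetical.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def identity(proof_of_a: A) -> A:
    """A proof of A -> A: transform any proof of A into itself."""
    return proof_of_a


def modus_ponens(proof_of_implication: Callable[[A], B], proof_of_a: A) -> B:
    """From a proof of A -> B and a proof of A, obtain a proof of B.

    Under the functional reading this is just function application.
    """
    return proof_of_implication(proof_of_a)


def syllogism(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    """From proofs of A -> B and B -> C, construct a proof of A -> C."""
    return lambda proof_of_a: g(f(proof_of_a))
```

For instance, syllogism(len, str) is a function from strings to strings, built from a function from strings to integers and a function from integers to strings, just as a proof of A → C is built from proofs of A → B and B → C.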


4a2 Curry 


PAT was discovered, independently of Heyting and Kolmogorov, by Curry and Feys [38]. In paragraph 8C of [38], Curry describes so-called F-objects, which correspond more or less to the simple types of Church in [30]. As a basis, a list of primitive objects $\theta_1, \theta_2, \dots$ is chosen. All these primitive objects are F-objects. Moreover, if $\alpha$ and $\beta$ are F-objects, then so is $F\alpha\beta$. Here, $F$ is a new symbol. $F\alpha\beta$ must be interpreted as the class of functions from $\alpha$ to $\beta$. If $\alpha$ is an F-object, then the statement $\vdash \alpha X$ must be interpreted as "the object $X$ belongs to $\alpha$". The following rule-F is adopted: If $\vdash FXYZ$ and $\vdash XU$ then $\vdash Y(ZU)$. The intuitive meaning of this rule is: If $Z$ belongs to $FXY$ and $U$ belongs to $X$, then $ZU$ belongs to $Y$. This rule immediately corresponds to the application-rule of λ-Church (see Ab).

Earlier in [38], Curry has introduced the combinator $P$, which is the implication combinator. $PXY$ can be interpreted as the proposition "if $X$ then $Y$". The combinator $P$ comes together with a rule-P: If $\vdash PXY$ and $\vdash X$ then $\vdash Y$. Curry notices that this rule behaves similarly to rule-F.

Curry is the first to give a formalisation of PAT. For each F-object $\alpha$ he defines a corresponding proposition $\alpha^P$ as follows: $\theta_i^P \equiv \theta_i$ and $(F\alpha\beta)^P \equiv P\alpha^P\beta^P$. Note that Curry's function $\alpha \mapsto \alpha^P$ is in fact an embedding of types in propositions (so a types-as-propositions embedding instead of a propositions-as-types embedding).
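The translation from F-objects to propositions is a simple structural recursion. The following Python sketch shows this under assumptions of our own: the class names Prim, F and P and the function to_prop are hypothetical, and primitive objects are identified by a name string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prim:
    """A primitive F-object, which is also an atomic proposition."""
    name: str


@dataclass(frozen=True)
class F:
    """The F-object 'F dom cod': the class of functions from dom to cod."""
    dom: object
    cod: object


@dataclass(frozen=True)
class P:
    """The proposition 'P hyp concl': if hyp then concl."""
    hyp: object
    concl: object


def to_prop(obj):
    """Translate an F-object into its corresponding proposition.

    Primitive objects are mapped to themselves; F is replaced by P,
    after translating both components.
    """
    if isinstance(obj, Prim):
        return obj
    if isinstance(obj, F):
        return P(to_prop(obj.dom), to_prop(obj.cod))
    raise TypeError("not an F-object")


X, Y = Prim("X"), Prim("Y")
k_type = F(X, F(Y, X))          # the F-object F X (F Y X)
k_axiom = to_prop(k_type)       # the proposition P X (P Y X)
```

Here k_axiom equals P(X, P(Y, X)), the proposition "if X then (if Y then X)".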
Curry then derives the following theorem, where $F_m X_1 \cdots X_m Y$ is an abbreviation of $FX_1(FX_2(\dots(FX_mY)\dots))$:

"If $\vdash F_m\xi_1\cdots\xi_m\eta X$ then $\vdash (F_m\xi_1\cdots\xi_m\eta)^P$.

Moreover, if $\vdash F_m\xi_1\cdots\xi_m\eta X$ is derivable from the premises $\vdash \alpha_i a_i$ $(i = 1,\dots,p)$ then $\vdash (F_m\xi_1\cdots\xi_m\eta)^P$ is derivable from the premises $\vdash \alpha_i^P$ $(i = 1,\dots,p)$."

([38], paragraph 9E, Theorem 1)

In other words: If there is (under certain type conditions $a_i : \alpha_i$) an object $X$ that is a function taking arguments of types $\xi_1,\dots,\xi_m$, resulting in an object of type $\eta$, then the corresponding theorem is derivable (if we presuppose $\alpha_i^P$). Or in short: The types-as-propositions embedding $\alpha \mapsto \alpha^P$ is sound.

The converse of the theorem holds as well:

"If $\vdash (F_m\xi_1\cdots\xi_m\eta)^P$ is derivable by rule-P from the premises $\vdash \alpha_i^P$, then for each derivation of this fact and each assignment of $a_1,\dots,a_p$ to $\alpha_1,\dots,\alpha_p$ respectively there exists an $X$ such that $\vdash F_m\xi_1\cdots\xi_m\eta X$ is derivable from the premises $\vdash \alpha_i a_i$ $(i = 1,\dots,p)$ by rule-F alone."

([38], paragraph 9E, Theorem 2)

In other words: The types-as-propositions embedding $\alpha \mapsto \alpha^P$ is complete.

The treatment of PAT in [38] is mainly directed towards Propositions as Types. Proofs as terms are implicitly present in the theory of [38]: The term $X$ in the proof of Theorem 1 of [38] can be seen as a proof of the proposition $(F_m\xi_1\cdots\xi_m\eta)^P$. But this is not made explicit in [38].


Example 4.1 As an example, we show the deduction of the proposition $A \to A$ from the logical axioms $X \to Y \to X$² (the K-axiom) and $(X \to Y \to Z) \to (X \to Y) \to X \to Z$ (the S-axiom), both in the style of the combinator $P$ and in the PAT-style.

² We assume that $\to$ is associative to the right, i.e. $X \to Y \to Z$ denotes $X \to (Y \to Z)$ and not $(X \to Y) \to Z$.


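• We use $P_m X_1 \cdots X_m Y$ as an abbreviation for
$$PX_1(PX_2(\dots(PX_mY)\dots)).$$
So $P_m X_1 \cdots X_m Y$ can be interpreted as the proposition
$$X_1 \to X_2 \to \dots \to X_m \to Y.$$

In this notation, rule-P looks as follows:

    ⊢ P_m X_1 X_2 ... X_m Y      ⊢ X_1
    -----------------------------------
    ⊢ P_{m-1} X_2 ... X_m Y

For terms $X, Y, Z$, we take the following axioms:

(K): ⊢ P_2 X Y X;
(S): ⊢ P_3 (P_2 X Y Z)(P X Y) X Z.

Let $A$ be a term. From the axioms we derive $\vdash PAA$, using rule-P:

    ⊢ P_3(P_2 A(PAA)A)(PA(PAA))AA      ⊢ P_2 A(PAA)A
    -------------------------------------------------
    ⊢ P_2(PA(PAA))AA                   ⊢ PA(PAA)
    -------------------------------------------------
    ⊢ PAA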


• In PAT-style, the situation is similar. Now we do not use any axioms, but we use some standard combinators. The combinator K (which can be compared to the λ-term λxy.x) has type F_2 X Y X, for arbitrary F-objects X, Y (a term can have more than one type in Curry’s theory). K can be seen as a “proof” of the axiom (F_2 X Y X)^P. This is indicated by putting K behind the axiom:

   (F_2 X Y X)^P K.


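The combinator S, comparable to the λ-term λxyz.xz(yz), has type F_3 (F_2 X Y Z)(F X Y) X Z for arbitrary F-objects X, Y, Z. S is a “proof” of the axiom (F_3 (F_2 X Y Z)(F X Y) X Z)^P. This is denoted as

   (F_3 (F_2 X Y Z)(F X Y) X Z)^P S.

The derivation above now translates to:

   ⊢ F_3 (F_2 A(FAA)A)(FA(FAA)) A A S     ⊢ F_2 A(FAA)A K
   --------------------------------------------------------
              ⊢ F_2 (FA(FAA)) A A (SK)                        ⊢ FA(FAA) K
              ------------------------------------------------------------
                                  ⊢ FAA(SKK)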


The conclusion of this derivation can be read as: SKK is a function from A to A, or, with PAT in mind: SKK is a proof of the proposition A → A.
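
The reading of SKK as a function from A to A can be checked computationally. The following is a minimal sketch, not part of the original text: it assumes that the combinators are modelled as curried Python functions, following the λ-terms λxy.x and λxyz.xz(yz) mentioned above. The name SKK is chosen here for illustration only.

```python
# Curried combinators, written after the λ-terms λxy.x and λxyz.xz(yz).
def K(x):
    return lambda y: x


def S(x):
    return lambda y: lambda z: x(z)(y(z))


# SKK is the term to which the PAT-style derivation assigns the type FAA.
SKK = S(K)(K)

if __name__ == "__main__":
    # S K K z reduces to K z (K z), which reduces to z.
    print(SKK("some proof of A"))
```

Applied to any argument, SKK returns that argument unchanged, which is the behaviour one expects of a proof of A → A: it turns a proof of A into a proof of A.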




4a3 Howard 


Howard [66] follows the argument of Curry and Feys [38] and combines it with Tait’s discovery of the correspondence between cut elimination and β-reduction of λ-terms [116].


Example 4.2 The idea is as follows. Consider the following derivation in natural deduction style of a proposition B:

      [A]
      D_1
       B
    -------       D_2
    A → B          A
    -----------------
            B

Here, [A] denotes that the assumption A has been discharged at the point where we concluded A → B from B. D_1 is a derivation with some assumptions of A and conclusion B, whilst D_2 is a derivation with conclusion A. The derivation D_2 can be used to replace the assumptions of A in derivation D_1. This means that we can transform the derivation to:


      D_2
       A
      D_1
       B

where copies of D_2 have replaced the assumptions A in D_1.
We can decorate the two derivations above with λ-terms that represent proofs. This results in the following two deductions:

         [x : A]
          T : B
    --------------------
    (λx:A.T) : (A → B)        S : A
    --------------------------------
            (λx:A.T)S : B

and

       S : A

    T[x:=S] : B


The assumption of A is represented by a variable x of type A. This is a natural idea: the variable expresses the idea “assume we have some proof of A”. The derivation D_1 is represented by a λ-term T, in which the variable x may occur (we can use the assumption A in derivation D_1). Then the term λx:A.T exactly represents a proof of A → B: it is a function that transforms any proof x of A into a proof T of B. As D_2 is a derivation of A (assume S is a proof term of A), we can apply λx:A.T to S, obtaining a proof (λx:A.T)S of B.

Now substituting the derivation D_2 for the assumptions of A in D_1 is nothing more than replacing the assumption “assume we have some proof of A” by the explicit proof S, or in other words: substituting S for x. This results in a term T in which each occurrence of x has been replaced by S: the λ-term T[x:=S].

We see that the proof transformation exactly corresponds to the β-reduction (λx:A.T)S →β T[x:=S].
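
The correspondence between replacing an assumption and substituting a proof term can be sketched in a few lines of code. The following is an illustration only, under stated assumptions: the class names Var, Lam and App and the functions subst and beta are hypothetical, and the substitution is the naive one, which is correct only if no bound variable of T occurs free in S.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    ty: str
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


def subst(t, x, s):
    """Return t[x:=s], assuming no bound variable of t occurs free in s."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, Lam):
        if t.var == x:  # x is rebound here, so the body does not refer to the outer x
            return t
        return Lam(t.var, t.ty, subst(t.body, x, s))
    return App(subst(t.fun, x, s), subst(t.arg, x, s))


def beta(t):
    """Contract a top-level redex (λx:A.T)S to T[x:=S]; leave other terms unchanged."""
    if isinstance(t, App) and isinstance(t.fun, Lam):
        return subst(t.fun.body, t.fun.var, t.arg)
    return t


# T uses the assumption x twice; S_proof stands for the explicit proof S of A.
T = App(App(Var("f"), Var("x")), Var("x"))
S_proof = Var("s")

if __name__ == "__main__":
    print(beta(App(Lam("x", "A", T), S_proof)))
```

Both occurrences of x in T are replaced by the proof term, just as copies of D_2 replace every assumption A in D_1.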


This is the first time that proofs are treated as λ-terms. Howard doesn’t call these λ-terms “proofs” but “constructions”.

Moreover, Howard’s treatment of PAT pays attention to both Propositions as Types (following the line of Curry and Feys) and Proofs as Terms (by using λ-terms to represent proofs, thus following the interpretation of logical implication as given by Heyting).

Howard’s discovery dates from 1969, but was not published until 1980. 


4a4 De Bruijn 


Independently of Curry and Feys and Howard, we find a variant of PAT in the first AUTOMATH system of De Bruijn (AUT-68 [95], [21]). Though De Bruijn was probably influenced by Heyting (see [23] in [95], p. 211), his ideas arose independently of Curry, Feys and Howard. This can be clearly seen in Section 2.4 of [20], where propositions as types (or better: proofs as terms) is implemented in the following way, which differs from the method of Curry and Howard.

First, a constant bool is introduced. bool is a type: the type of propositions. If b is a term of type bool (so b is a proposition), then true(b) is a primitive notion of type type. true(b) represents the type of the proofs of b. So, a proof of proposition b is of type true(b) and not of type b (since propositions themselves are not types). With this “bool-style” implementation (as it was called by De Bruijn in [23]) in mind, it becomes clear why De Bruijn prefers the terminology “proofs as terms” to “propositions as types”: in the bool-style implementation, propositions are not represented as types. Only the class of proofs of such a proposition is represented as a type. Proofs, however, are represented as terms, just as in Howard’s implementation of PAT. So in the bool-style implementation, the link between proposition and type is not as direct as the link between proof and term. The implementation of Howard (called “prop-style” by De Bruijn) does not make any distinction between a proposition and the type of its proofs.

The bool-style implementation has the advantage that one does not need a higher order lambda calculus to construct predicate logic. In relatively weak AUTOMATH systems such as AUT-68 one usually finds a “bool-style” implementation of PAT. It would be impossible to give a “prop-style” implementation in such a system, as its λ-calculus is not strong enough to support it. In AUTOMATH systems with a more powerful λ-calculus we also find “prop-style” implementations. See [92] for a global description of how prop-style implementations are made in AUTOMATH.

Another advantage of the bool-style implementation is that one does not depend on a fixed interpretation of the logical connectives. One is free to define one’s own logical system (and it is possible to base that system on the Brouwer-Heyting-Kolmogorov interpretation of the logical connectives, just like the prop-style implementation of PAT). This has been one of the reasons for De Bruijn to implement PAT in a bool-style way (see [23]).

Though the bool-style implementation has disappeared from later AUTOMATH systems, it is still in use in the Edinburgh Logical Framework [59], and the systems proposed in certain formulations of the Calculus of Constructions by Luo [86], Streicher [115], and Altenkirch [3]. Luo defines a class of proofs, called Prf(P), for each proposition P, and explicitly defines how to construct a proof of a proposition of the form ∀x:A.P by giving a rule

   Γ, x:A ⊢ M : Prf(P)
   ------------------------------
   Γ ⊢ (λx:A.M) : Prf(∀x:A.P)


Streicher and Altenkirch have a somewhat different approach. They also define a class of proofs for each proposition P (in Altenkirch’s thesis this class is denoted El(P)), but the proofs are constructed in a somewhat different way: an equality relation ~ is introduced, and the class of proofs El(∀x:A.P) is explicitly declared equivalent to the type Πx:A.El(P):

   El(∀x:A.P) ~ Πx:A.El(P).

This type Πx:A.El(P) represents the class of functions f with domain A such that for each a in A, fa is an element of El(P)[x:=a], so a proof of P[x:=a].


4b RTT in PAT style 


In this section we show that the system RTT of Chapter 2 can be described 
in a PAT style using the prop-style implementation of Curry and Howard. 
This will give us a better view on the various ways in which PAT can be 
implemented. 

Before we can give a description, we must make the following observa- 
tions: 


• Russell and Whitehead designed their system for classical logic. As the PAT principle in prop style is based on intuitionistic logic, we need to supply extra logical axioms to obtain the classical logic of Russell and Whitehead;

• RTT is constructed with the logical connectives ∨, ¬, ∀, while the PAT principle is strongly based on the interpretation of → and ∀ as function types. In the sequel of this section we will work with the symbols ∀, →, and an additional symbol ⊥, representing falsum. This makes it possible to interpret a proposition ¬f by f → ⊥, and a proposition f ∨ g by (f → ⊥) → g;

• As RTT distinguishes between propositions of various orders, it is not enough to provide one class of types. We must distinguish between several classes of types, corresponding to the orders of RTT.³


In Sections 4b1-4b3 we present a type system λRTT for a PAT representation of RTT. The type system is (almost) a so-called pure type system. Pure type systems (PTSs) were invented by Terlouw [118] and Berardi [13] as a framework in which many type systems can be described. While defining λRTT we try to give sufficient intuition about PTSs in order to understand the ideas behind λRTT. A short overview of PTSs is given in Appendix A. More details can be found in [5], [54] and [55].

In Section 4b4 we make a comparison between λRTT and RTT, and in Section 4b5 we give some examples on how to “do” logic of the Principia in λRTT.

In Section 4b6 we discuss in which ways PAT can be implemented. The various implementations lead to different level structures within the resulting PTS.

³Something similar is done in the proof checker Nuprl, where type universes U_1, U_2, ... are introduced. The type universe U_n contains all objects of order ≤ n. The approach below is not exactly similar to the Nuprl approach. In Nuprl, U_n is not only a subset of U_{n+1}, but also an element of U_{n+1}. See [69] (where these type universes are denoted *_1, *_2, ...), [67], [34]. This is the case neither in RTT nor in system λRTT below.

4b1 An introduction to λRTT

We now present a system λRTT that will be suitable for a PAT representation of RTT.


Definition 4.3 (Terms of λRTT) Let 𝒜, 𝒱 and ℛ be as in Chapter 2. Define the set 𝒯 of terms of λRTT by:

   𝒯 ::= *_s | *_n | □_n | 𝒜 | 𝒱 | ℛ | ι | ⊥ | 𝒯𝒯 | λ𝒱:𝒯.𝒯 | Π𝒱:𝒯.𝒯.


Remark 4.4 In this definition, *_s, *_n (for n ∈ ℕ⁺) and □_n (for n ∈ ℕ⁺) are so-called sorts.

• *_s is the sort of object types. There will be only one object type in λRTT, namely the type of individuals ι. An individual a has type ι; this is denoted a : ι;

• *_n (for n ∈ ℕ⁺) is the sort containing the propositions of order ≤ n. Notice that these propositions will be represented as types: we are presenting a PAT version of RTT;

• □_n (for n ∈ ℕ⁺) contains *_n, and (translations of) the ramified types of order n as they occur in RTT.

We write S for the set of sorts {*_s, *_1, □_1, *_2, □_2, ...}.


Remark 4.5 A term of the form Πx:A.B denotes a (dependent) function type. That is: Πx:A.B is the type that contains functions f with domain A and range ⋃_{x∈A} B (it is possible that x occurs free in B), such that fa has type B[x:=a] for all a of type A. Such function types can occur at several places:

• First of all, the translations of the propositional functions of a certain ramified type will belong to a function type. Look for instance at the ramified type (0^0)^1. Pfs of this type are functions that take an individual as argument, and return a proposition of order ≤ 1 as result.⁴ This suggests translating the ramified type (0^0)^1 by the function type Πx:ι.*_1 (see the forthcoming Definition 4.13);

• Secondly, certain propositions will be represented as function types. This has its origin in Heyting’s description of the proof of an implication as a function. If the proof of an implication is a function, then the implication itself (a proposition, or equivalently, the type of all the proofs of a proposition) must be a function type. For example, the RTT proposition R(a) → R(b) will (in 4.13) be translated into the type Πx:Ra.Rb;

• A universal quantification will also be translated as a function type. According to Heyting, the proof of a proposition ∀x:A[B] is a function that takes elements a of A as arguments, and returns proofs of B(a). For example, the RTT proposition ∀x:0^0[R(x)] will (again: in 4.13) be translated by Πx:ι.Rx. Here we see an example of a type Πx:A.B where B depends on the variable x. The intuition of a function of type Πx:ι.Rx coincides with the intuition of a proof of the proposition ∀x:0^0[R(x)]: a function taking individuals a as arguments, and returning a proof of R(a).


The type system λRTT will have a rule to introduce terms of type Πx:A.B (provided we have a term of type B): if b is a term of type B (possibly x occurs free in b), then λx:A.b is of type Πx:A.B. This can be understood with the above interpretations of Πx:A.B in mind. For instance, we represented the RTT-proposition ∀x:0^0[R(x)] by Πx:ι.Rx. If we have a term b of type Rx (so b proves Rx for an arbitrary individual x) then the function λx:ι.b, assigning b[x:=a] to each a ∈ ι, is indeed a proof of the proposition ∀x:0^0[R(x)].

The system λRTT also has a rule that shows what can be done with a term of type Πx:A.B. If a is a term of type ι, and f a proof of Πx:ι.Rx, then we can apply f to a, thus obtaining fa of type Rx[x:=a], which is Ra. Indeed, applying the function f to an individual a gives a proof of Ra.

⁴The proposition that is returned can be of order 1, for instance in the case we substitute the individual a for x in the pf ∀y:0^0[R(x, y)], or of order 0, for instance in the case we substitute a for x in the pf R(x, x). However, the returned proposition can never be of order > 1, due to Corollary 2.63. See also Remark 4.7.


Remark 4.6 It is usual to write A → B for Πx:A.B, if x does not occur free in B. Hence, in notation there is hardly any difference between the RTT proposition R(a) → R(b) (where → denotes logical implication) and the λRTT type Ra → Rb (where → is used to form a function type).


Remark 4.7 One may wonder why we chose *_n to be the type of all propositions of order ≤ n instead of the type of all propositions of order = n. This has a technical reason that already popped up in footnote 4, namely the translation of the ramified types of RTT into λ-terms. Consider a ramified type (t_1^{a_1}, ..., t_n^{a_n})^a. If τ_1, ..., τ_n are appropriate translations of t_1^{a_1}, ..., t_n^{a_n}, then one would like to translate (t_1^{a_1}, ..., t_n^{a_n})^a into τ_1 → ... → τ_n → *_m, where *_m represents the type of propositions of some order m. This would suggest that a propositional function f of type (t_1^{a_1}, ..., t_n^{a_n})^a always results in a proposition of (fixed) order m as soon as values of types t_1^{a_1}, ..., t_n^{a_n} are substituted for its free variables. However, it is impossible to determine such m:

• Consider the pfs f = z(a) and g = z(a) ∨ ∀z′:(0^0)^1[z′(b)] in a context Γ = {z:(0^0)^1}. Observe: Γ ⊢ f : ((0^0)^1)^2 and Γ ⊢ g : ((0^0)^1)^2. But if we substitute R(x) for z in both f and g, we obtain R(a) (a proposition of order 0) and R(a) ∨ ∀z′:(0^0)^1[z′(b)] (a proposition of order 1).
  So substituting the same propositional function R(x) in two different propositional functions f and g of the same type may result in propositions of different orders;

• Let f be as above, and extend Γ with the declaration x:0^0. Notice: both h_1 = R(x) and h_2 = R(x) ∨ ∀y:0^0[S(y)] are pfs of type (0^0)^1, so they can be substituted for z in f. As a result we obtain R(a), a proposition of order 0, and R(a) ∨ ∀y:0^0[S(y)], a proposition of order 1. So substituting different propositional functions h_1, h_2 of the same type in one and the same propositional function f may result in propositions of different orders.


Therefore, we cannot interpret *_m in τ_1 → ... → τ_n → *_m as the type of propositions of order m. However, by Corollary 2.63, the order of the proposition f[x_1, ..., x_n := g_1, ..., g_n] cannot be higher than the order of f. Hence, it is safe to translate the ramified type (t_1^{a_1}, ..., t_n^{a_n})^m into τ_1 → ... → τ_n → *_m, if we interpret *_m as the type of propositions of order ≤ m.

One might wonder whether this somewhat unusual interpretation of *_n makes λRTT much different from the original RTT. The idea of “propositions of order ≤ m”, however, is not very different from the idea of “propositions of order m”, as all propositions of order ≤ m have a logically equivalent proposition of order m (at least in the logic that Russell and Whitehead had in mind): if f has order ≤ m then


   f ∨ ¬∀z:()^{m-1}[z() ∨ ¬z()]


is of order m, and logically equivalent to f. The system λRTT will have a special rule (Incl) stating that any proposition of order ≤ m is also a proposition of order ≤ m + 1.


Contexts, as in the situation of Chapter 2, contain information on the types of variables. These variables can be of different nature: first of all, we still have the variables of Chapter 2. These variables live at the level of types, as propositions are interpreted as types. But now we can also have variables that serve as assumptions: if A is a proposition, then a variable of type A refers to an arbitrary proof of A. If a context contains such a declaration x:A, then this can be read as: it is supposed that A holds (and that x is some proof of A).

In Chapter 2, contexts were sets. In particular, the order in which the various declarations were mentioned did not matter: {x:0^0, y:()^1} is not different from {y:()^1, x:0^0}. Now, types that occur in a context can depend on variables that are declared somewhere else in that context. We do not want a variable to occur in a context before its type has been declared. Therefore, we present contexts as lists. The order in which the variables are declared is determined by the order in the list: in a context ⟨x:τ, y:υ⟩, the variable x was declared before y. Thus it is possible that υ depends on x, i.e. x may occur as a free variable in υ. On the other hand, y must not occur in τ, as y is declared after τ has appeared. Therefore, the context ⟨x:τ, y:υ⟩ is different from the context ⟨y:υ, x:τ⟩.
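
The ordering requirement on contexts can be illustrated by a small check. This is a sketch under simplifying assumptions, not a definition from the text: a context is represented as a list of pairs (variable, set of free variables of its type), and the hypothetical function well_scoped tests only the scoping condition, not typability.

```python
def well_scoped(context):
    """Check that every type only mentions variables declared earlier in the list.

    context: list of (variable, free variables of its type) pairs.
    """
    declared = set()
    for var, type_free_vars in context:
        # A declaration is rejected if the variable is not fresh, or if its
        # type depends on a variable that has not been declared yet.
        if var in declared or not set(type_free_vars) <= declared:
            return False
        declared.add(var)
    return True


if __name__ == "__main__":
    # ⟨x:τ, y:υ⟩ where υ depends on x, and the same declarations swapped.
    print(well_scoped([("x", set()), ("y", {"x"})]))
    print(well_scoped([("y", {"x"}), ("x", set())]))
```

The first context is accepted and the swapped one is rejected, which is why contexts are lists here and no longer sets.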

The presented intuition leads to the type system in Definition 4.9 below, which is in fact almost a PTS. The Π-formation rule determines which types (and, with the PAT principle in mind: which propositions) can be constructed in the system. The general form of the Π-formation rule is

   Γ ⊢ A : s_1     Γ, x:A ⊢ B : s_2
   ---------------------------------
         Γ ⊢ (Πx:A.B) : s_3

where s_1, s_2, s_3 are sorts. By specifying for which triples (s_1, s_2, s_3) the Π-formation rule can be applied, one can control which Π-types can be constructed. We now informally discuss which Π-formation rules we need in a PAT description of RTT. After that, we give a formal presentation of the system λRTT.


4b1.1 The translations of the ramified types 


To translate and type the ramified types, we need the rules (*_s, □_n, □_n) and (□_m, □_n, □_n) for n ≥ 1 and m < n.

Assume (t_1^{a_1}, ..., t_p^{a_p})^a is a ramified type, and that we already found proper translations τ_1, ..., τ_p for the ramified types t_1^{a_1}, ..., t_p^{a_p}, such that

• ⊢_λRTT τ_i : *_s if τ_i ≡ ι;
• ⊢_λRTT τ_i : □_{a_i} if τ_i ≢ ι

(we use ⊢_λRTT to denote derivability in λRTT, the PAT version of RTT). The type system λRTT will have an axiom rule that declares that *_a has type □_a, therefore we have: x_p:τ_p ⊢_λRTT *_a : □_a. Now distinguish:

• If τ_p ≡ ι then we use Π-formation rule (*_s, □_a, □_a):

   ⊢_λRTT τ_p : *_s     x_p:τ_p ⊢_λRTT *_a : □_a
   ----------------------------------------------
          ⊢_λRTT (Πx_p:τ_p.*_a) : □_a

• If τ_p ≢ ι then we use Π-formation rule (□_{a_p}, □_a, □_a) (notice that a_p < a):

   ⊢_λRTT τ_p : □_{a_p}     x_p:τ_p ⊢_λRTT *_a : □_a
   --------------------------------------------------
          ⊢_λRTT (Πx_p:τ_p.*_a) : □_a

In λRTT there will be a weakening rule so that we can weaken this conclusion by adding a variable x_{p-1}:

   x_{p-1}:τ_{p-1} ⊢_λRTT (Πx_p:τ_p.*_a) : □_a.

In a similar way, we can now deduce:

   ⊢_λRTT (Πx_{p-1}:τ_{p-1}.Πx_p:τ_p.*_a) : □_a

and so on until we have

   ⊢_λRTT (Πx_1:τ_1. ... .Πx_p:τ_p.*_a) : □_a.

Notice that the variables x_i are only dummy variables, and do not occur in any of the τ_j. We can therefore write

   τ_1 → ... → τ_p → *_a

instead of Πx_1:τ_1. ... .Πx_p:τ_p.*_a. This τ_1 → ... → τ_p → *_a is the translation of the RTT-type (t_1^{a_1}, ..., t_p^{a_p})^a.
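
The translation of ramified types described above is a simple recursion, which the following sketch makes explicit. The representation is an assumption made for illustration only: the ramified type of individuals is the string "0^0", a ramified type (t_1, ..., t_p)^a is a pair (list of argument types, a), and the sort *_a is written "*a" in the output. The function name translate is hypothetical.

```python
IND = "0^0"  # the ramified type of individuals


def translate(t):
    """Translate a ramified type into the corresponding arrow type, as a string."""
    if t == IND:
        return "ι"
    args, order = t  # t represents (t_1, ..., t_p)^order
    parts = []
    for arg in args:
        s = translate(arg)
        # Argument types that are themselves arrows need brackets,
        # because → associates to the right.
        parts.append(f"({s})" if "→" in s else s)
    parts.append(f"*{order}")
    return " → ".join(parts)


if __name__ == "__main__":
    print(translate(([IND], 1)))              # the type (0^0)^1
    print(translate(([([IND], 1), IND], 2)))  # the type ((0^0)^1, 0^0)^2
```

For (0^0)^1 this yields ι → *1, in agreement with the function type Πx:ι.*_1 of Remark 4.5.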


4b1.2 The translation of the logical implication 


The translation of propositional functions will have a lot of similarities with the mapping ~ of Definition 2.7. A propositional function f with free variables x_1 < ... < x_m will be translated into a λ-term λx_1:τ_1. ... λx_m:τ_m.F, where F is a λ-term of type *_n, n is the order of f, and τ_1, ..., τ_m are translations of the types of the variables x_1, ..., x_m. In this way, the translation of f will have type τ_1 → ... → τ_m → *_n, which is exactly the translation of the type of f. In this subsection, we focus on the translation of propositions of the form f → g; in the next subsection we focus on the translation of propositions of the form ∀x:A[f].

If f and g are propositions, we must be able to form the proposition f → g. According to Russell’s definition in Principia Mathematica, f → g is only shorthand for (¬f) ∨ g. In a system in PAT style, the implication plays a more central role, as the logical connectives ¬ and ∨ are defined with the use of implication.

If f is a proposition of order m, and g has order n, then f → g clearly has order max(m, n). In PAT, we can obtain this effect via a rule (*_m, *_n, *_max(m,n)). We can assume that F is a translation of f, and G is a translation of g, such that ⊢_λRTT F : *_m and ⊢_λRTT G : *_n. Introducing a variable x of type F does not affect the fact that G is a type of sort *_n: x:F ⊢_λRTT G : *_n (the system λRTT will have a so-called weakening rule that formalises this intuition). Applying the Π-formation rule (*_m, *_n, *_max(m,n)) results in

   ⊢_λRTT F : *_m     x:F ⊢_λRTT G : *_n
   --------------------------------------
       ⊢_λRTT (Πx:F.G) : *_max(m,n)

Notice that x is only a dummy variable here, and does not occur in G. Therefore we can write F → G instead of Πx:F.G. This F → G is the translation of the RTT-proposition f → g.


4b1.3 The translation of the universal quantifier 


If f is a pf with x as one and only free variable, then we want to translate f by some term λx:τ.F. With PAT in mind, we want to translate ∀x:t[f] by Πx:τ.F: a proof of ∀x:t[f] must be a function that assigns a proof of F[x:=a] to each a of type τ.

The construction of Πx:τ.F can be done in the following way. We can assume that λx:τ.F has a type that corresponds to a propositional function with one variable: a type of the form τ → *_n, where n is the order of f. This means that F itself must be of type *_n. The term Πx:τ.F (a translation of ∀x:t[f]) should have type *_n, as it represents a proposition of order n. Now τ is a translation of a ramified type, therefore it is of sort *_s (if τ ≡ ι) or sort □_m (if τ ≢ ι). For the construction of Πx:τ.F we need the Π-formation rule (*_s, *_n, *_n) or (□_m, *_n, *_n). Notice that x is a free variable of f, so its order is smaller than the order of f. In other words: m < n.

The rules (*_s, *_n, *_n) and (□_m, *_n, *_n) can also be used to represent universal quantification over pfs with more than one free variable. We discuss the situation for two free variables; for three or more free variables the procedure is similar. Let g be a pf with two free variables x_1 < x_2, having type (t_1^{a_1}, t_2^{a_2})^n, where t_i^{a_i} is the type of x_i. We assume that g has been translated into a term of the form λx_1:T(t_1^{a_1}).λx_2:T(t_2^{a_2}).G, of type T(t_1^{a_1}) → T(t_2^{a_2}) → *_n, where n is the order of g. With the definition of ~ in mind, we translate ∀x_1:t_1^{a_1}[g] into

   λx_2:T(t_2^{a_2}).Πx_1:T(t_1^{a_1}).G,

and ∀x_2:t_2^{a_2}[g] by

   λx_1:T(t_1^{a_1}).Πx_2:T(t_2^{a_2}).G.

We can form Πx_1:T(t_1^{a_1}).G in a similar way as Πx:τ.F above, using Π-formation rule (*_s, *_n, *_n) or (□_m, *_n, *_n) for an n ≥ 1 and an m < n. The term Πx_1:T(t_1^{a_1}).G has type *_n, and λx_2:T(t_2^{a_2}).Πx_1:T(t_1^{a_1}).G will have type T(t_2^{a_2}) → *_n, which is the appropriate type for a translation of ∀x_1:t_1^{a_1}[g]. Similarly, Πx_2:T(t_2^{a_2}).G has type *_n, and λx_1:T(t_1^{a_1}).Πx_2:T(t_2^{a_2}).G has type T(t_1^{a_1}) → *_n, which is appropriate for a translation of ∀x_2:t_2^{a_2}[g].


4b2 The system λRTT


Now that we have explained the way in which types are constructed in λRTT, we present the system in a formal way. As announced, it will (almost) have the form of a Pure Type System (PTS).

A PTS always has five fixed rules, and two “flexible” rules (axioms and Π-formation rules):


(Axioms) Axioms provide the types of certain constants that are used in the system. This rule is flexible: the axioms may vary from one PTS to another. In λRTT, we have the following axioms:

• *_n : □_n for n ≥ 1. See Remark 4.4;

• ⊥ : *_1. This means that we consider falsum to be a proposition of the lowest order;

• ι : *_s. See Remark 4.4;

• a : ι for a ∈ 𝒜. Each individual belongs to the type ι of individuals;

• R : ι → ... → ι → *_1 (a(R) times ι) for R ∈ ℛ. This illustrates that R is an a(R)-ary relation on individuals.

All axioms are derivable in an empty context;


(Start) If we want to introduce a variable, we can only do this if we assign a type to such a variable. In the type systems that we saw before (like RTT and λ→), it was always clear what the types of a certain type system are. There were always two definitions: first a definition that describes which types are allowed, and then a definition that describes which terms have what type. In a PTS, however, terms and types are mixed up in one derivation system. Types are recognised as follows:

• Each sort s ∈ S is a type;
• If Γ ⊢ A : s for an s ∈ S then A is a type (within the context Γ).

In the second case, we see that the type A also has a type itself, namely s. For these “typable” types we make it possible to introduce variables via the start rule:

   Γ ⊢ A : s
   ----------------
   Γ, x:A ⊢ x : A

This means that we can only introduce a variable of type A if A itself has a type s ∈ S. In λRTT, this is possible for the sort *_n (because this sort has type □_n). This means that it is possible to introduce variables for propositions of order n. However, □_n cannot be typed itself in λRTT. This is not harmful, as we do not want to introduce variables of type □_n. Such a variable would be a variable for a ramified type of order ≤ n, and such variables do not occur in RTT, either.

As the introduced variable must be seen as a new object, we demand that x is “fresh”: it must not occur anywhere in Γ or A;


(Weakening) Once we have derived Γ ⊢ M : N, we want to be able to add variables to the context Γ. Compare this to the weakening rule 2.45.5 for RTT. To add such a variable to the context, we take the same precautions as in the start rule, so we have a premise Γ ⊢ A : s if we want to add a variable of type A to Γ. The weakening rule now becomes:

   Γ ⊢ M : N     Γ ⊢ A : s
   ------------------------
       Γ, x:A ⊢ M : N

Again, we demand that x is fresh: it must not occur in Γ, M, N, or A;


(Π-formation) This rule describes which Π-types can be constructed. It is flexible: in different PTSs, different Π-types may be allowed. We already discussed this rule in Section 4b1. In Subsections 4b1.1-4b1.3 we described which Π-formation rules are needed for λRTT:

• (*_s, □_n, □_n) and (□_m, □_n, □_n) (m < n) to translate the ramified types;

• (*_m, *_n, *_max(m,n)) to translate the logical implication;

• (*_s, *_n, *_n) and (□_m, *_n, *_n) (m < n) to translate the universal quantification;


(Π-introduction) This rule describes how we can form terms of type Πx:A.B, once we have established that the type Πx:A.B can be constructed. If Γ ⊢ (Πx:A.B) : s has been derived (using the Π-formation rules), we can not only introduce variables of this type with the start and weakening rules, but we can also form terms of this type by λ-abstraction:

   Γ, x:A ⊢ b : B     Γ ⊢ (Πx:A.B) : s
   ------------------------------------
       Γ ⊢ (λx:A.b) : (Πx:A.B)

This rule is a modern version of the Abstraction Principles 1.1 and 1.2, and the abstraction rules 2.45.3 and 2.45.4 for RTT.

The Π-introduction rule is also called λ-formation rule;


(Π-elimination) This rule describes what can be done with a term f of type Πx:A.B. The intuition is clear: f is a function that takes arguments of type A, so it should be possible to apply f to any a of type A. And indeed:

   Γ ⊢ f : Πx:A.B     Γ ⊢ a : A
   -----------------------------
       Γ ⊢ fa : B[x:=a]

The substitution in the type of the result is necessary. This can be seen if we interpret Πx:A.B as a proposition ∀x:A.B. Then f is a proof of ∀x:A.B, and fa is a proof of B where x has been replaced by a, so B[x:=a]. The Π-elimination rule together with β-reduction can be seen as a modern substitution rule. We already saw in Chapter 2 that substitution in RTT can be seen as function application plus β-reduction to normal form in λ-calculus. Indeed: if we apply a term λx:A.b to a term a, we get (λx:A.b)a, which β-reduces to b[x:=a].

The Π-elimination rule is also called application rule;


(Conversion) If a term A β-reduces to a term A′, we consider A and A′ to be equal, in some way. A property of PTSs is that if Γ ⊢ A : B and A →β A′, then also Γ ⊢ A′ : B (the so-called “Subject Reduction” property). However, we do not have the “Type Reduction” property that Γ ⊢ A : B′ whenever Γ ⊢ A : B and B →β B′. This property does not even hold if we demand that Γ ⊢ B′ : s for some s ∈ S. As we want to have that β-equal types have the same inhabitants, we introduce a conversion rule:

   Γ ⊢ A : B     Γ ⊢ B′ : s     B =β B′
   -------------------------------------
              Γ ⊢ A : B′


As was already observed in the motivation for the start rule, there is an 
important difference between PTSs and the other type systems that have 
appeared in this thesis up till now. The other systems always made a clear 
distinction between terms and types. The definition of types is usually 
given first, and then a second definition indicates which terms are of what 
type. This distinction is not made in PTSs. There, types and terms are 
defined in one system. The jargon used may be confusing to readers who are not familiar with PTSs, and therefore we give the following definition:


Definition 4.8 (Terms and Types)
Terms

• A term A is a legal term in a context Γ if there is B such that Γ ⊢ A : B or Γ ⊢ B : A;
• A is a term of type B in a context Γ if Γ ⊢ A : B;

Types

• A term A is a type in a context Γ if there is a sort s ∈ S such that Γ ⊢ A : s;
• A is the type of B in a context Γ if Γ ⊢ B : A;
• A type A is inhabited in a context Γ if there is a term B such that Γ ⊢ B : A.

So: a type is always a term. And: a term can sometimes act as a type. Look for instance at the Π-introduction rule. In the right-hand premise, Πx:A.B is a term of type s. But in the conclusion, it acts as a type: the term λx:A.b is of type Πx:A.B.

For λRTT we need a rule in addition to the seven rules that were stated for PTSs above. That is why we said that λRTT is almost a PTS. This is the so-called inclusion rule, which describes the intuition behind *_n as the class of propositions of order ≤ n (and not only the class of propositions of order n). We can formulate this rule as follows:

   Γ ⊢ A : *_n
   ----------------
   Γ ⊢ A : *_{n+1}

We summarise the eight rules in the following definition:

Definition 4.9 (Derivation Rules for λRTT) Let s, s_1, s_2, s_3 range over S = {*_s, □_1, □_2, ..., *_1, *_2, ...}, and let

   R = {(*_s, □_n, □_n) | n ≥ 1} ∪
       {(□_m, □_n, □_n) | 1 ≤ m < n} ∪
       {(*_m, *_n, *_max(m,n)) | m, n ≥ 1} ∪
       {(*_s, *_n, *_n) | n ≥ 1} ∪
       {(□_m, *_n, *_n) | 1 ≤ m < n}.

The derivation rules for λRTT are as follows:

(Axioms)   ⊢ *_n : □_n   (n ≥ 1)
           ⊢ ι : *_s
           ⊢ ⊥ : *_1
           ⊢ a : ι   (a ∈ 𝒜)
           ⊢ R : ι → ... → ι → *_1   (R ∈ ℛ; a(R) times ι)

(Start)    Γ ⊢ A : s
           ----------------
           Γ, x:A ⊢ x : A

(Weak)     Γ ⊢ M : N     Γ ⊢ A : s
           ------------------------
           Γ, x:A ⊢ M : N

(Π-form)   Γ ⊢ A : s_1     Γ, x:A ⊢ B : s_2
           ---------------------------------   (s_1, s_2, s_3) ∈ R
           Γ ⊢ (Πx:A.B) : s_3

(Π-in)     Γ, x:A ⊢ b : B     Γ ⊢ (Πx:A.B) : s
           ------------------------------------
           Γ ⊢ (λx:A.b) : (Πx:A.B)

(Π-el)     Γ ⊢ M : Πx:A.B     Γ ⊢ N : A
           -----------------------------
           Γ ⊢ MN : B[x:=N]

(Conv)     Γ ⊢ A : B     Γ ⊢ B′ : s     B =β B′
           -------------------------------------
           Γ ⊢ A : B′

(Incl)     Γ ⊢ A : *_n
           ----------------
           Γ ⊢ A : *_{n+1}

The introduced variables x in the Start and Weakening rules are assumed to be fresh. If confusion with derivation rules of other type systems might arise, we use ⊢_λRTT instead of ⊢ to indicate derivability in λRTT.
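
The set R of Π-formation rules determines, for given sorts s_1 and s_2, at most one sort s_3. The following sketch makes this explicit. It is an illustration under assumed encodings, not part of the formal system: the sort *_s is the tuple ("*s",), *_n is ("*", n) and □_n is ("□", n), and the function name pi_sort is hypothetical.

```python
STAR_S = ("*s",)  # the sort *_s of object types


def star(n):
    return ("*", n)  # the sort *_n


def box(n):
    return ("□", n)  # the sort □_n


def pi_sort(s1, s2):
    """Return the s3 with (s1, s2, s3) in R, or None if no Π-formation rule applies."""
    if s2 == STAR_S or s2[1] < 1:
        return None
    n = s2[1]
    if s1 == STAR_S:
        return s2  # (*_s, □_n, □_n) and (*_s, *_n, *_n)
    if s1[0] == "□" and 1 <= s1[1] < n:
        return s2  # (□_m, □_n, □_n) and (□_m, *_n, *_n) with 1 ≤ m < n
    if s1[0] == "*" and s2[0] == "*" and s1[1] >= 1:
        return star(max(s1[1], n))  # (*_m, *_n, *_max(m,n))
    return None


if __name__ == "__main__":
    print(pi_sort(star(2), star(3)))  # an implication between propositions
    print(pi_sort(box(2), box(2)))    # rejected: m < n is required
```

An implication between propositions of orders 2 and 3 lands in *_3, while a ramified type cannot take an argument of its own order, because the rules (□_m, □_n, □_n) require m < n.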


4b3 Meta-properties of λRTT


We now describe some meta-properties of λRTT. Their formulation is very
close to the formulation of the usual meta-properties of PTSs, as described
in Section Ac of the Appendix. However, there are a few deviations. These
are due to the rule (Incl), which is not a rule in PTSs. The proofs of the
meta-properties below are as in the standard literature on PTSs ([5], [55]).
In Remark 4.12 we provide some intuition behind these meta-theorems and
compare them with the meta-theorems we found for RTT in Chapter 2.


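Theorem 4.10 (Properties of λRTT)
1. (Church-Rosser) If $A \twoheadrightarrow_\beta B_1$ and
$A \twoheadrightarrow_\beta B_2$ then there is a $C$ such that
$B_1 \twoheadrightarrow_\beta C$ and $B_2 \twoheadrightarrow_\beta C$;

2. (Free Variables) Let $\Gamma = x_1{:}A_1, \dots, x_n{:}A_n$, and assume
$\Gamma \vdash A : B$. Then:

• $FV(A) \cup FV(B) \subseteq \mathrm{DOM}(\Gamma)$;
• For all $1 \le i \le n$ there is $s_i \in S$ such that
$x_1{:}A_1, \dots, x_{i-1}{:}A_{i-1} \vdash A_i : s_i$;

3. (Substitution) Assume $\Gamma, x{:}A, \Delta \vdash B : C$ and
$\Gamma \vdash D : A$.
Then $\Gamma, \Delta[x:=D] \vdash B[x:=D] : C[x:=D]$;

4. (Thinning) Assume $\Gamma, \Delta$ are legal and $\Gamma \subseteq \Delta$.
Then $\Gamma \vdash A : B \Rightarrow \Delta \vdash A : B$;

5. (Generation)
(a) If $\Gamma \vdash *_n : C$ then $C = \Box_n$;
If $\Gamma \vdash \iota : C$ then $C = *_s$;
If $\Gamma \vdash \bot : C$ then $C = *_n$ for some $n \in \mathbb{N}$;
For $a \in \mathcal{A}$, if $\Gamma \vdash a : C$ then $C = \iota$;
For $R \in \mathcal{R}$, if $\Gamma \vdash R : C$ then
$C = \underbrace{\iota \to \cdots \to \iota}_{a(R)\ \text{times}\ \iota} \to *_1$,
or $a(R) = 0$ and $C = *_n$ for some $n \in \mathbb{N}$;

(b) If $\Gamma \vdash x : C$ then there are $B, s$ such that
$\Gamma \vdash B : s$, $x{:}B \in \Gamma$, and either $B =_\beta C$, or there
are $m, n$ with $m < n$ and $B = *_m$, $C = *_n$;

(c) If $\Gamma \vdash (\Pi x{:}A.B) : C$ then there are $s_1, s_2, s_3$ such
that $(s_1, s_2, s_3) \in R$, $\Gamma \vdash A : s_1$, and
$\Gamma, x{:}A \vdash B : s_2$. Moreover, $C = s_3$ or there are $m, n$ such
that $m < n$, $s_3 = *_m$ and $C = *_n$;

(d) If $\Gamma \vdash (\lambda x{:}A.b) : C$ then there are $B, s$ such that
$\Gamma \vdash (\Pi x{:}A.B) : s$; $\Gamma, x{:}A \vdash b : B$ and
$C =_\beta \Pi x{:}A.B$;

(e) If $\Gamma \vdash (AB) : C$ then there are $x, P, Q$ such that
$\Gamma \vdash A : (\Pi x{:}P.Q)$, $\Gamma \vdash B : P$, and either
$C =_\beta Q[x:=B]$ or there are $m, n$ with $m < n$ and
$Q[x:=B] = *_m$, $C = *_n$;

6. (Correctness of Types) If $\Gamma \vdash A : B$ then there is $s \in S$
such that $\Gamma \vdash B : s$ or $B = s$;

7. (Subject Reduction) If $\Gamma \vdash A : B$ and $A \to_\beta A'$ then
$\Gamma \vdash A' : B$;

8. (Permutation) If $\Gamma, x{:}A, y{:}B, \Delta \vdash C : D$ and
$x \notin FV(B)$ then $\Gamma, y{:}B, x{:}A, \Delta \vdash C : D$;

9. (Topsort Lemma) If $s$ is a topsort and $\Gamma \vdash A : s$ then $A$ is
not a variable and $A$ is not of the form $A_1A_2$ or $\lambda x{:}A_1.A_2$.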

Theorem 4.11 (Strong Normalisation for λRTT) If $\Gamma \vdash A : B$ then
$A$ is strongly normalising.


PROOF: We embed λRTT into system λC of the Barendregt cube by mapping
$*_n$ to $*$ (for all $n \in \mathbb{N}$), $*_s$ to $*$, and $\Box_n$ to
$\Box$. In this way, λRTT becomes a pure type system (as rule (Incl)
disappears) that is a subsystem of λC. As all terms of λC are strongly
normalising, λRTT is strongly normalising as well. ∎
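
The order-erasing map used in this proof can be checked mechanically on a finite fragment of the rule set. The sketch below is only an illustration under stated assumptions: sorts are encoded as the strings `"*_s"`, `"*_n"` and `"box_n"`, the helper names `erase` and `rules_up_to` are hypothetical, and only rules with indices up to a chosen bound are generated.

```python
def erase(sort: str) -> str:
    """Forget orders: '*_s' and '*_n' become '*', 'box_n' becomes 'box'."""
    return "box" if sort.startswith("box") else "*"


def rules_up_to(bound: int) -> set:
    """All rules of Definition 4.9 whose indices lie in 1..bound."""
    rng = range(1, bound + 1)
    rules = set()
    for n in rng:
        rules.add(("*_s", f"box_{n}", f"box_{n}"))
        rules.add(("*_s", f"*_{n}", f"*_{n}"))
        for m in rng:
            rules.add((f"*_{m}", f"*_{n}", f"*_{max(m, n)}"))
            if m < n:
                rules.add((f"box_{m}", f"box_{n}", f"box_{n}"))
                rules.add((f"box_{m}", f"*_{n}", f"*_{n}"))
    return rules


# The four Pi-formation rules of the Calculus of Constructions.
LAMBDA_C_RULES = {("*", "*", "*"), ("*", "box", "box"),
                  ("box", "*", "*"), ("box", "box", "box")}

# Every erased rule of the fragment is a rule of lambda-C.
erased = {tuple(erase(s) for s in rule) for rule in rules_up_to(4)}
print(erased <= LAMBDA_C_RULES)
```

Under the same erasure the premise and the conclusion of (Incl) both become $\Gamma \vdash A : *$, which is why that rule disappears.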


Remark 4.12 We provide some intuition for the properties in Theorems 
4.10 and 4.11. 


1. The Church-Rosser theorem is a basic theorem on λ-calculus. It
indicates that it does not matter in which way one makes a calculation
(list of β-reductions): The result will always be the same (or, more
precisely: The results can be coerced to be the same). The various
proofs that are known even present a constructive method to find a
common reduct $C$ of two β-equal terms $B_1$ and $B_2$;


2. The first part of the Free Variables Lemma is comparable to Lemma 
2.56.1 of RTT. 


The second part has no counterpart in RTT. This is because in RTT
the set of types is determined by a separate definition (2.37), and
types do not have a type themselves. In λRTT, the derivation rules
determine which types are “allowed”, and which types are not: An
allowed type has a sort as type. The second part of the Free Variables
Lemma shows that types that occur in a context are, indeed, always
typable by a sort;


3. The Substitution Lemma can be compared to the Substitution Rule
of RTT (see 2.45.6). Observe that the substitution $[x:=D]$ is now not
only carried out in $B$ (as was the case in RTT), but also in $C$ and
$\Delta$. This is due to the fact that in λRTT, types may have free
variables;


4. The Thinning Lemma is comparable to the Weakening Rule of RTT 
(see 2.45.5); 


5. The Generation Lemma (called “Stripping lemma” in [54]) is one of
the most important meta-properties of a PTS. The derivation rules
of λRTT are, as is the case with most usual formulations of PTSs,
not syntax directed, i.e. the last rule in a derivation is not necessarily
determined by the structure of the term and the context of the
conclusion of the derivation. This is due to rules like (Weak), (Conv)
and (Incl). If the conclusion of the derivation is $\Gamma \vdash A : C$,
then the Generation Lemma provides information on the type of the
subterms of $A$, and on the structure of $C$.


The Generation Lemma for λRTT is comparable to Theorems 2.84
and 2.85 for RTT.


Case (a) of the Generation Lemma might raise the question why a
relation symbol $R$ of arity $> 0$ cannot have type
$\iota \to \cdots \to \iota \to *_n$ for $n > 1$, while a relation symbol
$R'$ of arity 0 can have type $*_n$ for $n > 1$. This has to do with the
way in which we implemented type inclusion. We only declared that $*_n$
is a subtype of $*_{n+1}$ for all $n$, but did not extend this to types of
the form $\tau_1 \to \tau_2$. It is quite possible to work with type
systems that have an extensive subtyping relation (see for instance [33]),
but as we do not need an extension of subtyping in this Section, we do
not introduce it here;


6. Correctness of Types shows that every term $B$ for which there are
$\Gamma, A$ such that $\Gamma \vdash A : B$ is typable by a sort. Compare
this to the second part of the Free Variables Lemma, which proves a
similar thing for types occurring in a context;


7. Subject Reduction shows that the type of an expression does not
change during a calculation. As there is no real reduction in RTT, we
do not have an equivalent statement in RTT. One could see the fact
that the Free Variable property 2.58 is maintained under substitution
as a weak form of Subject Reduction;


8. Permutation is closely related to the Permutation Rule 2.45.7;


9. The Topsort Lemma shows that the topsorts (see Definition A.34; in
λRTT, the topsorts are $*_s$ and $\Box_1, \Box_2, \dots$) are only
inhabited by types (i.e. constants or terms of the form $\Pi x{:}A.B$).


Footnote: It is possible to present PTSs in a syntax directed way. See Severi’s thesis [113].

Strong Normalisation for λRTT can be compared to the theorem on
Existence of Substitution for RTT, 2.73. However, in λRTT many more
reductions are possible. For instance, the proof terms of λRTT do not have
an equivalent in RTT, and these proof terms are also λ-terms that might
β-reduce.


4b4 Interpreting RTT in λRTT


In this section we formally prove our claim that λRTT indeed is a PAT
interpretation of RTT. We translate the ramified types of RTT to types of
λRTT:


Definition 4.13 We define a type $T(t^a)$ for each ramified type $t^a$:
• $T(0^0) = \iota$;
• $T((t_1^{a_1}, \dots, t_n^{a_n})^a) = T(t_1^{a_1}) \to \cdots \to T(t_n^{a_n}) \to *_{\max(a,1)}$.


All the translations of the ramified types are typable in λRTT. As
announced in Subsection 4b1.1, we use the Π-formation rules
$(*_s, \Box_n, \Box_n)$ and $(\Box_m, \Box_n, \Box_n)$ for $m < n$ and
$n \ge 1$.


Lemma 4.14 In λRTT we can derive:
• $\vdash T(0^0) : *_s$;
• $\vdash T((t_1^{a_1}, \dots, t_n^{a_n})^a) : \Box_{\max(a,1)}$.


PROOF: Induction on the definition of $T$; use the rules
$(*_s, \Box_n, \Box_n)$ and $(\Box_m, \Box_n, \Box_n)$ of λRTT as sketched
in Subsection 4b1.1. ∎
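
The translation of Definition 4.13 and the sort it receives can be sketched as a small recursive program. This is an illustrative sketch only: the classes `Ind` and `Pred`, the functions `translate` and `sort_of`, and the ASCII stand-in `i` for the type of individuals are hypothetical names, and sorts are written as the strings `"*_s"` and `"box_n"`.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Ind:
    """The ramified type 0^0 of individuals."""


@dataclass(frozen=True)
class Pred:
    """The ramified type (t_1, ..., t_n)^a."""
    args: Tuple["RType", ...]
    order: int


RType = Union[Ind, Pred]


def translate(t: RType) -> str:
    """T of Definition 4.13, rendered as a string ('i' stands for iota)."""
    if isinstance(t, Ind):
        return "i"
    parts = []
    for arg in t.args:
        s = translate(arg)
        parts.append(f"({s})" if "->" in s else s)
    parts.append(f"*_{max(t.order, 1)}")
    return " -> ".join(parts)


def sort_of(t: RType) -> str:
    """Sort of translate(t), checking that a Pi-formation rule exists for each argument."""
    if isinstance(t, Ind):
        return "*_s"                       # axiom: iota has type *_s
    n = max(t.order, 1)                    # the codomain *_n has type box_n
    for arg in t.args:
        s = sort_of(arg)
        # (*_s, box_n, box_n) always applies; (box_m, box_n, box_n) needs m < n
        if s != "*_s" and not int(s.split("_")[1]) < n:
            raise ValueError("argument order must be below the order of the type")
    return f"box_{n}"
```

For the ramified type $(0^0, (0^0)^1)^2$, written `Pred((Ind(), Pred((Ind(),), 1)), 2)`, the sketch yields the string `i -> (i -> *_1) -> *_2` and the sort `box_2`.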


On the other hand: The inhabitants of $*_s$ and $\Box_n$ are all
translations of ramified types:


Lemma 4.15
1. If $\Gamma \vdash A : *_s$ then $A = \iota$;
2. If $\Gamma \vdash A : \Box_n$ then there is a ramified type $t^a$ with
$T(t^a) = A$.

PROOF:


1. As $*_s$ is a topsort, $A$ cannot be a variable, or of the form
$A_1A_2$ or $\lambda x{:}A_1.A_2$ (see Theorem 4.10.9). As there are no
$s_1, s_2 \in S$ for which $(s_1, s_2, *_s) \in R$, $A$ cannot be of the
form $\Pi x{:}A_1.A_2$. Therefore, $A$ must be a constant. By the
Generation Lemma, 4.10.5(a), $A = \iota$;


2. Use induction on the length of $A$. As $\Box_n$ is a topsort, $A$
cannot be a variable, a λ-abstraction, or an application. Considering the
Generation Lemma, 4.10.5, we conclude that $A = *_n$ (and in that case
we are done: Take $t^a = ()^n$) or $A = \Pi x{:}B_1.B_2$.

In the last case, we use the Generation Lemma to obtain:
$\Gamma \vdash B_1 : s_1$ and $\Gamma, x{:}B_1 \vdash B_2 : s_2$, where
$(s_1, s_2, \Box_n) \in R$.
Due to the definition of $R$, $s_2 = \Box_n$. Hence
$\Gamma, x{:}B_1 \vdash B_2 : \Box_n$. Using the induction hypothesis, we
can find $u_2^{a_2}, \dots, u_m^{a_m}$ such that
$B_2 = T((u_2^{a_2}, \dots, u_m^{a_m})^n)$ (it is not the case that
$B_2 = T(0^0)$, otherwise we would have
$\Gamma, x{:}B_1 \vdash \iota : \Box_n$).

Due to the definition of $R$ we also have $s_1 = *_s$ or
$s_1 = \Box_{a_1}$ for some $a_1$. By induction, $B_1 = \iota$, or
$B_1 = T(u_1^{a_1})$ for a ramified type $u_1^{a_1}$.
Hence $A = T((0^0, u_2^{a_2}, \dots, u_m^{a_m})^n)$ or
$A = T((u_1^{a_1}, u_2^{a_2}, \dots, u_m^{a_m})^n)$ (notice
that $a_1 < n$, as $s_1 = \Box_{a_1}$ and $(s_1, s_2, \Box_n) \in R$).


We extend the mapping T to propositional functions: 


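Definition 4.16 Let $\Gamma$ be a RTT-context. We define a term $T(i)$ for
all $i \in \mathcal{V} \cup \mathcal{A}$, and a term $T(f)$ for each
$f \in \mathcal{P}$ for which the variables of $f$ are contained in
$\mathrm{DOM}(\Gamma)$:


• $T(x) = x$ for $x \in \mathcal{V}$;
• $T(a) = a$ for $a \in \mathcal{A}$;


• $T(R(i_1, \dots, i_{a(R)})) = \lambda x_1{:}T(t_1).\cdots\lambda x_n{:}T(t_n).R(T(i_1))\cdots(T(i_{a(R)}))$.

Here, $x_1 < \dots < x_n$ are the free variables of
$R(i_1, \dots, i_{a(R)})$, and $x_i{:}t_i \in \Gamma$ for $1 \le i \le n$;

• If $f_1, f_2 \in \mathcal{P}$ and
$T(f_j) = \lambda x_{j1}{:}A_{j1}.\cdots\lambda x_{jm_j}{:}A_{jm_j}.F_j$,
where $x_{j1} < \dots < x_{jm_j}$ are the free variables of $f_j$, then
define
$T(f_1 \to f_2) = \lambda x_1{:}T(t_1).\cdots\lambda x_n{:}T(t_n).(F_1 \to F_2)$.
Here, $x_1 < \dots < x_n$ are the free variables of $f_1 \to f_2$, and
$x_i{:}t_i \in \Gamma$ for $1 \le i \le n$;


• If $f \in \mathcal{P}$ and
$T(f) = \lambda x_1{:}A_1.\cdots\lambda x_n{:}A_n.F$, where
$x_1 < \dots < x_k < \dots < x_n$ are the free variables of $f$, then define
$T(\forall x_k{:}t_k[f]) = \lambda x_1{:}A_1.\cdots\lambda x_{k-1}{:}A_{k-1}.\lambda x_{k+1}{:}A_{k+1}.\cdots\lambda x_n{:}A_n.\Pi x_k{:}T(t_k).F$;


• If $z \in \mathcal{V}$ and
$k_1, \dots, k_m \in \mathcal{A} \cup \mathcal{V} \cup \mathcal{P}$, then
define
$T(z(k_1, \dots, k_m)) = \lambda x_1{:}T(t_1).\cdots\lambda x_n{:}T(t_n).z(T(k_1))\cdots(T(k_m))$,
where $x_1 < \dots < x_n$ are the free variables of $z(k_1, \dots, k_m)$
and $x_i{:}t_i \in \Gamma$.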


We show that the legal pfs of RTT are legal terms in λRTT. For one step
in the proof, we need a Lemma:


Lemma 4.17 If $\Gamma \vdash A : \Box_n$ or $\Gamma \vdash A : *_s$ then
$FV(A) = \emptyset$.


PROOF: By Lemma 4.15, there is a ramified type $t^a$ such that
$A = T(t^a)$. From the definition of $T(t^a)$ we conclude that
$FV(T(t^a)) = \emptyset$. ∎


Lemma 4.18 Let $f \in \mathcal{P}$ and assume $\Gamma \vdash f : t^a$ in
RTT. Then $\vdash T(f) : T(t^a)$ in λRTT.


PROOF: Induction on the structure of $f$. Though the proof is rather
straightforward, we treat all cases, in order to show where which rules
of λRTT are used.


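1. $f = R(i_1, \dots, i_{a(R)})$, and $x_1 < \dots < x_m$ are the free
variables of $f$. Notice: $x_i{:}0^0 \in \Gamma$ for $1 \le i \le m$
(Theorem 2.85). Therefore:
$T(f) = \lambda x_1{:}\iota.\cdots\lambda x_m{:}\iota.R\,i_1 \cdots i_{a(R)}$.
Notice that
$x_1{:}\iota, \dots, x_m{:}\iota \vdash R\,i_1 \cdots i_{a(R)} : *_1$.
By abstraction,
$\vdash T(f) : \underbrace{\iota \to \cdots \to \iota}_{m\ \text{times}\ \iota} \to *_1$;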
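
2. $f = f_1 \to f_2$; $x_1 < \dots < x_m$ are the free variables of $f$,
and $x_{j1} < \dots < x_{jm_j}$ are the free variables of $f_j$. Assume
$x_i{:}t_i^{a_i} \in \Gamma$ and $x_{jk}{:}t_{jk}^{a_{jk}} \in \Gamma$. By
Theorem 2.84, the $f_j$ are legal, and by Theorem 2.58,

$$\Gamma \vdash f_j : (t_{j1}^{a_{j1}}, \dots, t_{jm_j}^{a_{jm_j}})^{b_j}$$

for some $b_j$. Observe that we can write (due to Definition 4.16)

$$T(f_j) = \lambda x_{j1}{:}T(t_{j1}^{a_{j1}}).\cdots\lambda x_{jm_j}{:}T(t_{jm_j}^{a_{jm_j}}).F_j$$

and that, by the induction hypothesis,

$$\vdash T(f_j) : T(t_{j1}^{a_{j1}}) \to \cdots \to T(t_{jm_j}^{a_{jm_j}}) \to *_{b_j}.$$

This means (Generation Lemma):

$$x_{j1}{:}T(t_{j1}^{a_{j1}}), \dots, x_{jm_j}{:}T(t_{jm_j}^{a_{jm_j}}) \vdash F_j : *_{b_j}$$

and therefore (Weakening):

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash F_1 : *_{b_1}$$

and

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}), x{:}F_1 \vdash F_2 : *_{b_2}$$

(notice that $\{x_1, \dots, x_m\} \supseteq \{x_{j1}, \dots, x_{jm_j}\}$).
With rule $(*_{b_1}, *_{b_2}, *_{\max(b_1,b_2)})$:

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash F_1 \to F_2 : *_{\max(b_1,b_2)}.$$

Now notice that $a = \max(b_1, b_2)$, and use λ-abstraction $m$ times:

$$\vdash T(f) : T(t^a);$$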


3. $f = \forall x{:}u^b[f']$, $x_1 < \dots < x_m$ are the free variables of
$f$, and $x_i{:}t_i^{a_i} \in \Gamma$. For simplicity of notation, we will
assume that $x < x_1$. As $x \in FV(f')$, we can write
$T(f') = \lambda x{:}T(u^b).\lambda x_1{:}T(t_1^{a_1}).\cdots\lambda x_m{:}T(t_m^{a_m}).F'$.
By Theorem 2.84 and Lemma 2.56, we have that $f'$ is legal in
$\Gamma \cup \{x{:}u^b\}$, and by Theorem 2.58 and Corollary 2.61:
$\Gamma \cup \{x{:}u^b\} \vdash f' : (u^b, t_1^{a_1}, \dots, t_m^{a_m})^a$.
By the induction hypothesis,

$$\vdash T(f') : T(u^b) \to T(t_1^{a_1}) \to \cdots \to T(t_m^{a_m}) \to *_a.$$

Therefore (Generation Lemma),

$$x{:}T(u^b), x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash F' : *_a,$$

so by the permutation lemma and Lemma 4.17

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}), x{:}T(u^b) \vdash F' : *_a.$$

As $T(u^b)$ has either type $\Box_b$ or type $*_s$ (Lemma 4.14), and
$b < a$, we can use rule $(\Box_b, *_a, *_a)$ or $(*_s, *_a, *_a)$ to derive

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash (\Pi x{:}T(u^b).F') : *_a.$$

By λ-abstraction, we find $\vdash T(f) : T(t^a)$;


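4. $f = z(k_1, \dots, k_n)$, $x_1 < \dots < x_m$ are the free variables of
$f$, and $x_i{:}t_i^{a_i} \in \Gamma$. By Theorem 2.84, the $k_i$ are
either legal pfs of predicative type in $\Gamma$, or variables (so one of
the $x_i$'s), or individuals (of type $0^0$). Let $u_i^{b_i}$ be the type
of $k_i$ in $\Gamma$. By Theorem 2.85:
$z{:}(u_1^{b_1}, \dots, u_n^{b_n})^{a-1} \in \Gamma$. Using the induction
hypothesis for the $k_j \in \mathcal{P}$, we have that

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash T(k_j) : T(u_j^{b_j})$$

for $1 \le j \le n$. We also have

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash z : T(u_1^{b_1}) \to \cdots \to T(u_n^{b_n}) \to *_{a-1}.$$

Therefore,

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash z(T(k_1))\cdots(T(k_n)) : *_{a-1},$$

hence (by the (Incl) rule)

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash z(T(k_1))\cdots(T(k_n)) : *_a.$$

By λ-abstraction, $\vdash T(f) : T(t^a)$.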

Remark 4.19 The use of the (Incl) rule in case 4 of the above proof is
essential. There, it is shown that

$$x_1{:}T(t_1^{a_1}), \dots, x_m{:}T(t_m^{a_m}) \vdash z : T(u_1^{b_1}) \to \cdots \to T(u_n^{b_n}) \to *_{a-1},$$

and one could try to form λ-abstractions without using the (Incl) rule
first. However, $z \in FV(f)$, so at a certain point one has to construct a
λ-abstraction over $z$. Let $t_z$ be such that $z{:}t_z \in \Gamma$. The
resulting term $\lambda z{:}T(t_z).G$ has type
$T(t_z) \to \dots \to *_{a-1}$. As $z$ has order $a - 1$, $T(t_z)$ is
a term of type $\Box_{a-1}$ (Lemma 4.14). One needs a rule
$(\Box_{a-1}, \Box_p, \Box_p)$ with $p > a - 1$ (as for $p \le a - 1$ such
a rule is not present) for the construction of
$T(t_z) \to \dots \to *_{a-1}$. Hence, one has to use the rule (Incl) to
replace $*_{a-1}$ by $*_a$, which has type $\Box_a$. This makes it possible
to use $p = a$.


4b5 Logic in RTT and λRTT


Before we can use λRTT as a system in which we can prove theorems, we
must add some logical axioms to it. These axioms mainly have to do with
the symbol $\bot$. For $\to$ and $\forall$ the needed derivation rules are
already provided by the type theory (via the PAT principle à la
Curry-Howard).


• The $\neg$-introduction rule of natural deduction systems is already
incorporated in the translation of $\neg A$ to $A \to \bot$. If we have a
proof $T$ of $\bot$ under the assumption that $x$ is a proof of $A$, then
$\lambda x{:}A.T$ is a proof of $A \to \bot$;


• For the rule “ex falso sequitur quodlibet” the type system does not
provide a natural equivalent. We therefore introduce an axiom

$$\mathrm{ExFalso}_n : \Pi f{:}*_n.\Pi p{:}\bot.f$$

for each $n \in \mathbb{N}^+$. We will store these axioms in some basic
context $\Gamma_0$.

We remark that the type $\Pi f{:}*_n.\Pi p{:}\bot.f$ is indeed a type in
λRTT. It is straightforward to derive that it is a type of sort
$*_{n+1}$.

We also remark that it is necessary to introduce separate axioms
$\mathrm{ExFalso}_1, \mathrm{ExFalso}_2, \dots$. If we want to conclude the
proposition $f$ using the ExFalso-axiom, we must provide the type of $f$,
and in that type the order of $f$ is also mentioned. This is a usual thing
in ramified type systems, and such constructions occur also in Principia
(cf. [121], pp. 41-43);


• RTT is based on classical logic, and PAT on intuitionistic logic.
Therefore we must add a “classical” axiom. We prefer to add the “law of
double negation”, and introduce axioms

$$\mathrm{DblNeg}_n : \Pi f{:}*_n.\Pi p{:}(f \to \bot) \to \bot.f.$$

It is easy to show that the type of this axiom is of sort $*_{n+1}$. We
store the axioms $\mathrm{DblNeg}_n$ in the same context $\Gamma_0$.


We compare the obtained system with the original logical system that
was proposed in Principia Mathematica. That system is presented in what
we would now call a natural deduction style. It has one derivation rule,
modus ponens (cf. Principia, *1·1), and the following axioms:

$$
\begin{array}{ll}
(p \lor p) \to p & (*1{\cdot}2); \\
q \to (p \lor q) & (*1{\cdot}3); \\
(p \lor q) \to (q \lor p) & (*1{\cdot}4); \\
(p \lor (q \lor r)) \to (q \lor (p \lor r)) & (*1{\cdot}5); \\
(q \to r) \to ((p \lor q) \to (p \lor r)) & (*1{\cdot}6); \\
f(x) \to \exists z[f(z)] & (*9{\cdot}1); \\
f(x) \lor f(y) \to \exists z[f(z)] & (*9{\cdot}11);
\end{array}
$$

In any assertion containing a free variable, this free
variable may be turned into an apparent variable of
which all possible values are asserted to satisfy the
function in question (*9·13).


The formulation of the last axiom is not as precise as the other ones. In
later formulations of logic (as the ones by Gödel [57] and Church [30]) we
see that the axioms for propositional logic are mostly maintained, but that
the axioms on predicate logic are replaced by two other ones:

$$
\begin{array}{l}
\forall x[f] \to f[x:=a] \\
\forall x[f \lor g] \to f \lor \forall x[g]
\end{array}
$$

where we assume that $x \notin FV(f)$ and that $a$ does not contain any
variable that is bound in $f$ at a place where $x$ is free in $f$. These
new axioms are theorems in the Principia (*9·2 and *9·25), and Russell's
axioms are proved in Church’s system [30].

We must take into account that Gödel and Church both use simple type
theory instead of ramified type theory. But Russell’s system, by accepting
the axiom of reducibility, is in fact also based on simple type theory.


Clearly, λRTT has also modus ponens (function application). We now
show that all the axioms of Russell's system can be derived in λRTT as well:


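Theorem 4.20 In λRTT with the axioms $\mathrm{ExFalso}_n$ and
$\mathrm{DblNeg}_n$ one can construct terms of the following types:

$$
\begin{array}{ll}
T(\forall p{:}*_k[(p \lor p) \to p]) & (*1{\cdot}2); \\
T(\forall p{:}*_k \forall q{:}*_m[q \to (p \lor q)]) & (*1{\cdot}3); \\
T(\forall p{:}*_k \forall q{:}*_m[(p \lor q) \to (q \lor p)]) & (*1{\cdot}4); \\
T(\forall p{:}*_k \forall q{:}*_m \forall r{:}*_n[(p \lor (q \lor r)) \to (q \lor (p \lor r))]) & (*1{\cdot}5); \\
T(\forall p{:}*_k \forall q{:}*_m \forall r{:}*_n[(q \to r) \to ((p \lor q) \to (p \lor r))]) & (*1{\cdot}6); \\
T(\forall f{:}\iota \to *_m \forall x{:}\iota[f(x) \to \exists z{:}\iota[f(z)]]) & (*9{\cdot}1); \\
T(\forall f{:}\iota \to *_m \forall x{:}\iota \forall y{:}\iota[(f(x) \lor f(y)) \to \exists z{:}\iota[f(z)]]) & (*9{\cdot}11).
\end{array}
$$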
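
PROOF: The following terms are inhabitants of the types above:

$$
\begin{array}{ll}
\lambda p{:}*_k.\lambda x{:}(p \to \bot) \to p.\mathrm{DblNeg}_k\, p\, (\lambda y{:}p \to \bot.y(xy)) & (*1{\cdot}2); \\
\lambda p{:}*_k.\lambda q{:}*_m.\lambda y{:}q.\lambda x{:}p \to \bot.y & (*1{\cdot}3); \\
\lambda p{:}*_k.\lambda q{:}*_m.\lambda x{:}(p \to \bot) \to q.\lambda y{:}q \to \bot.\mathrm{DblNeg}_k\, p\, (\lambda z{:}p \to \bot.y(xz)) & (*1{\cdot}4); \\
\lambda p{:}*_k.\lambda q{:}*_m.\lambda r{:}*_n.\lambda x{:}(p \to \bot) \to ((q \to \bot) \to r).\lambda y{:}q \to \bot.\lambda z{:}p \to \bot.xzy & (*1{\cdot}5); \\
\lambda p{:}*_k.\lambda q{:}*_m.\lambda r{:}*_n.\lambda x{:}q \to r.\lambda y{:}(p \to \bot) \to q.\lambda z{:}p \to \bot.x(yz) & (*1{\cdot}6); \\
\lambda x{:}\iota.\lambda f{:}\iota \to *_m.\lambda p{:}fx.\lambda q{:}(\Pi y{:}\iota.fy \to \bot).qxp & (*9{\cdot}1); \\
\lambda x{:}\iota.\lambda y{:}\iota.\lambda f{:}\iota \to *_m.\lambda p{:}(fx \to \bot) \to fy.\lambda q{:}(\Pi z{:}\iota.fz \to \bot).qy(p(qx)) & (*9{\cdot}11).
\end{array}
$$

∎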


The last axiom of Russell’s logical system is implemented in λRTT by
the Π-elimination rule.

We conclude that the embedding $T$ is sound with respect to the logics
that are used in RTT and λRTT.
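
As an informal cross-check of the first two proof terms of Theorem 4.20, the same terms can be replayed in Lean 4. The block below is only a sketch under assumptions that are not part of the thesis: `Prop` stands in for the sorts $*_k$ and $*_m$ (so all order information is lost), `False` stands in for $\bot$, the disjunction $p \lor q$ is unfolded as $(p \to \bot) \to q$, and `DblNeg` is taken as a hypothesis rather than as an axiom stored in a context.

```lean
-- (*1·2): (p ∨ p) → p, with p ∨ p unfolded as (p → False) → p.
example (DblNeg : ∀ f : Prop, ((f → False) → False) → f)
    (p : Prop) : ((p → False) → p) → p :=
  fun x => DblNeg p (fun y => y (x y))

-- (*1·3): q → (p ∨ q), with p ∨ q unfolded as (p → False) → q.
example (p q : Prop) : q → ((p → False) → q) :=
  fun y _ => y
```

The second term needs no classical axiom, which matches the proof term for (*1·3) above.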


4b6 Various implementations of PAT 


As was explained in Definition 4.8, types and terms are mixed up in one
system λRTT. As a consequence we can spot a hierarchy of levels in λRTT.
The hierarchies in RTT and λRTT are depicted in Figures 4 and 5. Some of
the levels in these figures are empty. Later, we will compare the systems RTT
and λRTT with other systems. For this comparison, we will draw similar
pictures, in which we need the levels that are empty in the presentation of
RTT and λRTT.

We have the level of propositional types: This is formed by the terms
that have a type of the form $\Box_n$. Below the level of propositional
types we find propositions and propositional functions. These have a
propositional type as type. Under the level of propositions and
propositional functions we find the level of proofs: A proof always has a
proposition as its type.

[Figure 4: Levels within RTT. Levels shown: Ramified Types ($0^0$, $(0^0)^1$); Propositions ($\forall x{:}()^1[x() \to x()]$); Prop. functions ($R(x)$).]

[Figure 5: Levels within λRTT. Levels shown: Topsorts; Prop. functions ($\lambda x{:}\iota.Rx$); Propositions ($\Pi x{:}*_1.\Pi y{:}x.x$).]

There is a second hierarchy of levels in λRTT. At its top we find $*_s$,
the type of individual types. It has as its only inhabitant $\iota$, the
type of individuals. One could imagine situations in which $*_s$ has more
inhabitants. For instance, if we would allow several sets of individuals.
Or if the Π-formation rule $(*_s, *_s, *_s)$ is allowed, so that also types
like $\iota \to \iota$ can be constructed. Below the type of individuals,
we find the individuals themselves.

We see that the transformation of RTT to λRTT has introduced some
new term levels:


• A level of topsorts. These topsorts are needed to type the ramified
types. Such typing is needed for two reasons:


Variable introduction If we want to introduce a variable of a certain
type $\tau$ in a PTS, we have to establish that $\tau$ is an allowed
type. This is done by requiring that $\tau$ itself must have a certain
type;

Type construction To control the construction of types with the
Π-formation rules, Π-formation is only allowed with certain types.
This is determined by the type of that type. The ramified types
of RTT, however, do not have a type in RTT. We use the topsorts
$\Box_1, \Box_2, \dots$ to type the translations of these ramified types
in λRTT. This also gives us a good way to check the order of a
type: A type of order $n$ has type $\Box_n$;


• A level of proofs. This level was empty in RTT, as proofs are not part
of the theory of RTT.


From our PTS-point of view it is remarkable that sorts of λRTT which are
denoted by the symbol $*$ are not all at the same level. The sort $*_s$
occurs at the level of topsorts, while the sorts $*_n$ live at the level of
the ramified types. Moreover, we have already seen that each Π-formation
rule of the form $(\Box_n, s_1, s_2)$ also has a variant of the form
$(*_s, s_1, s_2)$, and vice versa. With this in mind it would have been
clearer to write $\Box_0$ instead of $*_s$, and use $*_0$ for $\iota$.

The reason that we chose the symbol $*_s$ (instead of $\Box_0$) has to do
with traditions within the discipline of Pure Type Systems: The levels are
usually partitioned in such a manner that $*_s$ and $*_n$ live at the same
level. See Figure 6.

[Figure 6: Levels of λRTT in PTS tradition. Levels shown: Object Types; Propositions ($\Pi x{:}*_1.\Pi y{:}x.x$); Prop. functions ($\lambda x{:}\iota.Rx$); Objects; Proofs ($\lambda x{:}*_1.\lambda y{:}x.y$).]

[Figure 7: Levels of λRTT in bool-style PAT. Levels shown: Objects; Proofs; Propositions; Prop. functions; the operator true, as in $\mathrm{true}(\forall x{:}()^1.x \to x)$.]

Let’s have a closer look at the traditional situation. The PAT principle 
within PTSs is often implemented by lifting the propositions and proposi- 
tional functions from term level to type level, but leaving the individuals 
at term level. We see that proofs are not introduced at a new level below 
term level, but that the type level (as far as propositions and propositional 
functions are concerned) is lifted, and that the proofs are put at the term 
level that was originally occupied by the propositions. 

The treatment of propositions at a higher level than individuals can 
be understood if we take a look at first-order logic. In systems for first- 
order logic, quantification over individuals is possible, but quantification 
over propositions and propositional functions is not allowed. This leads to 
the treatment of propositions and propositional functions at a higher level. 

Contrary to the PTS tradition, we did not lift propositions from term 
level to type level when we constructed a PAT implementation for RTT. 
Instead, we built a new level below the level of propositional functions, 
propositions and individuals: The level of proofs. In this way the double 
role of propositions is more clear: 


• They are terms, as they live at the same level as the individuals;
• They are types, as they can have inhabitants (their proofs).


The PAT implementation à la De Bruijn can in various ways be seen as
a compromise between the two different points of view above (though it
has been developed independently). A PAT implementation of RTT à la De
Bruijn could be depicted as in Figure 7. There are three hierarchies now:
The two well-known hierarchies of objects and propositions/propositional
functions, plus a new hierarchy for proofs. The hierarchies of propositional
functions and proofs are connected via the operator true (see Section 4a4),
which assigns a type of proofs to each proposition. This picture:


• Respects the wish to treat propositions at term level;
• Respects the wish to treat proof classes as types;


• Can also be seen, in retrospect, as a compromise from a historical
point of view. Though AUTOMATH and the PAT notion in De Bruijn
style are mainly independent of other developments in logic and type
theory, the bool-style PAT notion (1968) historically fits between the
style of Figure 5 for the Ramified Theory of Types (1908-1912) and
the style of Pure Type Systems in Figure 6 (1988).


4c STT in PAT style


From the description of RTT in PAT style it is easy to make a description
λSTT of STT in PAT style: Simply remove all references to orders. This
means that $*_n$ has to be replaced by $*$, and $\Box_n$ by $\Box$ (for all
$n \in \mathbb{N}$). In fact, the same procedure is followed by Ramsey
[101], Gödel [57] and Church [30] in their presentations of simple type
theory.

One of the consequences is that rule (Incl) disappears. We obtain a pure
type system with axiom $* : \Box$ and rules $(*_s, \Box, \Box)$,
$(\Box, \Box, \Box)$, $(*_s, *, *)$, $(*, *, *)$ and $(\Box, *, *)$. This
looks similar to the Calculus of Constructions λC. In λC there is the same
axiom $* : \Box$, and rules $(*, *, *)$, $(*, \Box, \Box)$, $(\Box, *, *)$,
$(\Box, \Box, \Box)$. But λSTT is more restricted than λC. More
specifically, rule $(*_s, \Box, \Box)$ is not as powerful as rule
$(*, \Box, \Box)$ in λC. As in λC, we do have higher order logic, but we do
not have the higher order functions that are present in λC. This is due to
the fact that $*_s$ is a topsort in λSTT, while $*$ has type $\Box$.
Therefore, λSTT is a system somewhere in between λω and λC.
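
The difference between the two specifications can be made concrete by listing them side by side and computing their topsorts. The sketch below is illustrative only and rests on assumptions: sorts are plain strings, only sort-to-sort axioms are listed, `topsorts` is a hypothetical helper, and a topsort is taken to be a sort $s$ for which no axiom $s : s'$ exists (the usual PTS reading, assumed here to agree with Definition A.34).

```python
def topsorts(sorts, axioms):
    """Sorts s for which no axiom s : s' exists."""
    typed = {s for (s, _) in axioms}
    return {s for s in sorts if s not in typed}


# Order-free system obtained from the ramified one (lambda-STT).
STT_SORTS = {"*_s", "*", "box"}
STT_AXIOMS = {("*", "box")}
STT_RULES = {("*_s", "box", "box"), ("box", "box", "box"),
             ("*_s", "*", "*"), ("*", "*", "*"), ("box", "*", "*")}

# Calculus of Constructions (lambda-C).
C_SORTS = {"*", "box"}
C_AXIOMS = {("*", "box")}
C_RULES = {("*", "*", "*"), ("*", "box", "box"),
           ("box", "*", "*"), ("box", "box", "box")}

print(sorted(topsorts(STT_SORTS, STT_AXIOMS)))  # '*_s' is a topsort here
print(sorted(topsorts(C_SORTS, C_AXIOMS)))      # only 'box' is a topsort
print(("*", "box", "box") in STT_RULES)         # the lambda-C rule is absent
```

In this encoding the rule $(*, \Box, \Box)$ of λC has no counterpart in the λSTT rule set; its place is taken by $(*_s, \Box, \Box)$, whose first component is a topsort.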

Notice that we have given a PAT version of the simple theory of types,
and not of the simply typed λ-calculus of Church [30]. In [30] there are
more things formalised than in STT:


• In the simply typed λ-calculus there are more types. For instance,
$\iota \to \iota$ is a type, and so is $* \to \iota$ (in [30] this type is
denoted $o \to \iota$). More precisely: For a PAT implementation of
Church’s theory we should add a rule $(*_s, *_s, *_s)$, and a rule
$(\Box, *_s, *_s)$;


• Church has an additional logical operator ($\iota$) in his system. This
operator also occurs in Russell’s RTT, but only as an abbreviation
and not as a new syntactical object (see [121], pp. 66-68 and pp.
173-175).


Remark 4.21 Together, the rules $(*_s, \Box, \Box)$ and
$(\Box, \Box, \Box)$ form a version of the simply typed lambda calculus of
Church. The identification would have been complete if we had identified
$*_s$ with $\Box$.

Conclusions


We saw that there are various ways in which PAT can be implemented in 
type theories. There are two main streams: 


Curry-Howard approach: This approach treats propositions as types,
and a proof of a proposition is an inhabitant of the type that represents
that proposition. The implementation is based on the
Brouwer-Heyting-Kolmogorov interpretation of the logical connectives. In
particular, a proof of an implication $A \to B$ is represented as a function
that transforms proofs of the proposition $A$ (terms of the type $A$) to
proofs of $B$ (terms of type $B$);


De Bruijn approach: For each proposition P we create a type bool(P). 
A proof of P in this approach is not a term of type P (as in the 
Curry-Howard style), but a term of type bool(P). 


In Curry-Howard style implementations, logic is already part of the
system. The logical connective $\to$ and the quantifier $\forall$
immediately translate to the construction of function types. Using
higher-order logic, other logical connectives can be defined in terms of
$\to$ and $\forall$. De Bruijn style implementations have more
possibilities. One can implement the logical system independently of the
type system. But it is also possible to use function types for the
translation of $\to$ and/or $\forall$ as is done in the Curry-Howard
style.

The various implementations lead to various levels in type systems.
This was depicted in Figures 6 and 7. In Curry-Howard style and the PTS
tradition, propositions are at the same level as types, and therefore, proofs
are at the same level as objects (terms). In De Bruijn style (bool-style),
proofs, propositions, and propositional functions all live at term level.

A third division into levels appeared when we gave a description of RTT
in PAT-style. See Figure 5. On the one hand, λRTT uses a Curry-Howard
style implementation. There is no difference between a proposition and
the type of its proofs. Therefore, proofs and propositions do not live at
the same level (as is the case in PAT à la De Bruijn). On the other hand,
objects and propositions live at the same level.

The implementation of RTT in PAT-style not only serves as an elaborate 
example of PAT, but also shows that ramified types can be placed in the 
framework of Pure Type Systems without too many problems. 


Chapter 5 


Automath 


The first practical use of the propositions-as-types principle sketched in 
Chapter 4 is found in the AUTOMATH project [95]. The AUTOMATH sys- 
tems are the first examples of proof checkers, and in this way they are 
predecessors of modern proof checkers like Coq [42] and Nuprl [34]. 

The project was started in 1967 by N.G. de Bruijn, and 


“it was not just meant as a technical system for verification of 
mathematical texts, it was rather a life style with its attitudes 
towards understanding, developing and teaching mathematics.” 


([23]; see [95] p. 201)


Thus, the roots of AUTOMATH are not to be found in logic or type 
theory, but in mathematics and the mathematical vernacular [22]. This is 
also clearly reflected in the goals of the AUTOMATH project: 


“1. The system should be able to verify entire mathematical 
theories. 


2. The system should remain very general, tied as little as possible
to any particular set of rules for logic and foundations
of mathematics. Such basic rules should preferably belong
to material that can be presented for verification, on the
same level with things like mathematical axioms that have
to be explained to the reader.¹


3. The way mathematical material is to be presented to the 
system should correspond to the usual way we write mathe- 
matics. The only things to be added should be details that 
are usually omitted in standard mathematics.” 


([23]; see [95] pp. 209-210) 


Goal 1 was definitely achieved: Van Benthem Jutting translated and
verified Landau’s “Grundlagen der Analysis” [83] in AUTOMATH (see [9],
[10]) and Zucker formalised classical real analysis in AUTOMATH (see [124]).

A consequence of goal 2 has already been discussed in Section 4a4. 
There, we saw that de Bruijn used a PAT principle that was somewhat 
different from Curry and Howard’s. Curry and Howard identified the log- 
ical implication and the universal quantifier with function types, following 
Heyting’s intuitionistic interpretation of logical connectives. In doing so, 
they do not leave a possibility for a different interpretation of implication 
and universal quantification. Using PAT in de Bruijn’s style, the rules for
manipulating the logical connectives always have to be made explicit by the
user (an example of such a specification can be found in Sections 12 and 13
of [11]). This makes it possible to give interpretations of logical connectives
that are not based on interpreting implication and universal quantification
by a function type.

De Bruijn has spent a lot of effort in achieving goal 3. He has studied the 
language of mathematics in great depth (see [22]), and many of his insights 
are reflected in AUTOMATH. We mention some AUTOMATH features that 
help to achieve goal 3: 


• The use of books. Just like a mathematical text, AUTOMATH is written
line by line, where each line may refer to definitions or results
given in earlier lines;


• The use of definitions. Without definitions, expressions very soon
become too long. Moreover, a definition gives a name to a certain
expression, and this name makes it easier for the user to remember
(or understand) what the use of the definiens is;


• The use of a parameter mechanism together with a default mechanism.
We discuss the advantages of these mechanisms in Section 5a.


¹ So: the logical rules should be treated on the same level as mathematics. A logical
rule can be introduced as an axiom in the same way a mathematical axiom can be
introduced. Other logical rules can be derived from existing rules, like mathematical
theorems can be derived from existing theorems and axioms. [remark by the author]


As AUTOMATH was developed quite independently from other developments
in the world of type theory and λ-calculus, there are many things
to be explained in the relation between the various AUTOMATH languages 
and other type theories. In this chapter we focus on the relation between 
AUTOMATH and Pure Type Systems (PTSs). Both [5] and [54] mention this 
relation in a few lines, but as far as we know a satisfactory explanation of 
the relation between AUTOMATH and PTSs is not available. Moreover, both 
works consider AUTOMATH without one of its most important mechanisms: 
The definition system. Even the system PAL, which roughly consists of the 
definition system of AUTOMATH only, is able to express some simple math- 
ematical reasoning (see for instance Section 5 of [21]). Moreover, recent 
developments on the use of definitions in Pure Type Systems by Bloo, Ka- 
mareddine and Nederpelt [17, 16] and Severi and Poll [114] justify renewed 
research on the relation between AUTOMATH and PTSs. The combination 
of the work of Severi and Poll [114] and the parameter mechanism of
AUTOMATH reveals a gap in the theory of PTSs, and this gap will be
filled in Chapter 6.

In Section 5a we give a description of AUT-68, which is one of the
most elementary AUTOMATH systems. In Section 5b we discuss how we can
transform AUT-68 into a PTS. In doing so, we must notice that AUT-68
has some properties that are not usual for PTSs:


• AUT-68 has η-reduction;

• AUT-68 has Π-application and Π-reduction (as it does not make any
  difference between λ and Π);

• AUT-68 has a definition system;

• AUT-68 has a parameter mechanism.


η-reduction is the reduction relation generated by $(\lambda x.Rx) \to_\eta R$, where
$x \notin \mathrm{FV}(R)$. In systems with Π-application, a term $\Pi x{:}A.B$ can be applied
to a term $N$ (of type $A$). This results in $(\Pi x{:}A.B)N$. The usual application
rule of Pure Type Systems then changes to

\[
\frac{\Gamma \vdash M : \Pi x{:}A.B \qquad \Gamma \vdash N : A}
     {\Gamma \vdash MN : (\Pi x{:}A.B)N}
\]

In such systems, Π behaves like λ, and as a consequence, there also is a
rule of Π-reduction

\[
(\Pi x{:}A.B)N \to_\Pi B[x:=N].
\]

In AUTOMATH, one does not even make any distinction between the terms
$\Pi x{:}A.B$ and $\lambda x{:}A.B$. They are both denoted $[x{:}A]B$. It is not always easy
to see whether a term $[x{:}A]B$ represents (in notation of PTSs) $\lambda x{:}A.B$ or
$\Pi x{:}A.B$.

We pay more attention to Π-application and Π-reduction at the end of
this chapter; for more details see [17] and the literature on AUTOMATH
[95].

We consider η-reduction not as one of the essential features of
AUTOMATH, and prefer to focus on the definition and parameter mechanisms,
which are the most characteristic type-theoretical features of AUTOMATH.

In Section 5c, we present a system λ68 that is (almost) a PTS. We show
that it has the usual properties of PTSs and we prove that λ68 can be seen
as AUT-68 without η-reduction, Π-application and Π-reduction. There is
no direct parameter system in λ68 either, but this parameter system is
hidden in the rules for the construction of product types. In Section 5d we
compare the definition system of AUT-68 with several other, more modern,
type systems with definitions.


5a Description of AUTOMATH 


During the AUTOMATH-project, several AUTOMATH-languages have been
developed. They all have two mechanisms for describing mathematics. One
of them essentially is a typed λ-calculus, with the important features of
λ-abstraction, λ-application and β-reduction. The other mechanism is the use
of definitions and parameters. The latter is the same for most
AUTOMATH-systems, and the difference between the various systems is mainly
caused by different λ-calculi that are included. In this section we will
describe the system AUT-68 which not only is one of the first
AUTOMATH-systems, but also a system with a relatively simple typed
λ-calculus, which makes it easier to focus on the (less known) mechanism
for definitions and parameters.

A more extensive description of AUT-68, on which our description below
is based, can be found in [11], [20] or [40].


5a1 Books, lines and expressions


In the conception underlying the AUTOMATH-systems, a mathematical text 
is thought of as being a series of consecutive “clauses”. Each clause is 
expressed in AUTOMATH as a line. Lines are stored in so-called books. For 
writing lines and books in AUT-68 we need 


• The symbol type;

• A set $V$ of variables;

• A set $C$ of constants;

• The symbols ( ) [ ] : — ,

We assume that $V$ and $C$ are infinite, or at least offer us as many different
elements as needed. We also assume that $V \cap C = \emptyset$ and that
$\mathbf{type} \notin V \cup C$.

The elements of $V$ are called block openers, the elements of $V \cup C$ are
called identifiers in [21].


Definition 5.1 (Expressions) We define the set $\mathcal{E}$ of AUT-68-expressions
(or, in short, expressions) inductively:

(variable) If $x \in V$ then $x \in \mathcal{E}$;

(parameter) If $a \in C$, $n \in \mathbb{N}$ ($n = 0$ is allowed) and
$\Sigma_1,\ldots,\Sigma_n \in \mathcal{E}$ then $a(\Sigma_1,\ldots,\Sigma_n) \in \mathcal{E}$.
$\Sigma_1,\ldots,\Sigma_n$ are called the parameters of $a(\Sigma_1,\ldots,\Sigma_n)$;

(abstraction) If $x \in V$, $\Sigma \in \mathcal{E} \cup \{\mathbf{type}\}$ and $\Omega \in \mathcal{E}$
then $[x{:}\Sigma]\Omega \in \mathcal{E}$;

(application) If $\Sigma_1, \Sigma_2 \in \mathcal{E}$ then $(\Sigma_2)\Sigma_1 \in \mathcal{E}$.

Sometimes we will consider the set
$\mathcal{E}^+ \stackrel{\text{def}}{=} \mathcal{E} \cup \{\mathbf{type}\}$.

Remark 5.2 The AUT-68-expression $[x{:}\Sigma]\Omega$ is AUTOMATH-notation for
abstraction terms. In PTS-notation one would write either $\lambda x{:}\Sigma.\Omega$ or
$\Pi x{:}\Sigma.\Omega$. In a relatively simple AUTOMATH-system like AUT-68, it is easy
to determine whether $\lambda x{:}\Sigma.\Omega$ or $\Pi x{:}\Sigma.\Omega$ is the correct interpretation for
$[x{:}\Sigma]\Omega$. This is harder in AUTOMATH-systems with a more complex
λ-calculus, like AUT-QE.

Remark 5.3 The AUT-68-expression $(\Sigma_2)\Sigma_1$ is AUTOMATH-notation for
the intended application of the “function” $\Sigma_1$ to the “argument” $\Sigma_2$. In
PTS-notation: $\Sigma_1\Sigma_2$.

Note the unusual order of “function” $\Sigma_1$ and “argument” $\Sigma_2$ in $(\Sigma_2)\Sigma_1$.
An advantage of this notation with respect to the classical notation becomes
clear if we assume that $\Sigma_1$ is a function $[x{:}\Omega_1]\Omega_2$. In that case,
$(\Sigma_2)\Sigma_1 \equiv (\Sigma_2)[x{:}\Omega_1]\Omega_2$. The argument $\Sigma_2$ and the abstraction
$[x{:}\Omega_1]$ belong together: as soon as the intended application of the function
$\Sigma_1$ to its argument is carried out, $\Sigma_2$ is substituted for $x$ everywhere in
$\Omega_2$. It is convenient to put expressions that belong together next to each
other. In the usual, classical notation, we would write $([x{:}\Omega_1]\Omega_2)\Sigma_2$,
where $\Sigma_2$ and $[x{:}\Omega_1]$ are separated from each other by the expression
$\Omega_2$. This makes the structure of the expression less clear, in particular if
$\Omega_2$ is a very long expression. The advantages of writing $(\Sigma_2)\Sigma_1$ instead of
the classical $\Sigma_1\Sigma_2$ are extensively discussed in the works of Kamareddine
and Nederpelt (see for instance [94], [72], [73]).


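Definition 5.4 (Free variables)

• $\mathrm{FV}(x) \stackrel{\text{def}}{=} \{x\}$;

• $\mathrm{FV}(a(\Sigma_1,\ldots,\Sigma_n)) \stackrel{\text{def}}{=} \bigcup_{i=1}^{n} \mathrm{FV}(\Sigma_i)$;

• $\mathrm{FV}([x{:}\Sigma]\Omega) \stackrel{\text{def}}{=} \mathrm{FV}(\Sigma) \cup (\mathrm{FV}(\Omega) \setminus \{x\})$;

• $\mathrm{FV}((\Sigma_2)\Sigma_1) \stackrel{\text{def}}{=} \mathrm{FV}(\Sigma_1) \cup \mathrm{FV}(\Sigma_2)$.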


Convention 5.5 We adhere to the usual convention that names of bound
variables in an expression differ from the free variables in that expression.

We use $\equiv$ to denote syntactical equivalence (up to renaming of bound
variables) on expressions.

Definition 5.6 If $\Omega, \Sigma_1,\ldots,\Sigma_n$ are expressions (in $\mathcal{E}$), and
$x_1,\ldots,x_n$ are distinct variables, then

\[
\Omega[x_1,\ldots,x_n := \Sigma_1,\ldots,\Sigma_n]
\]

denotes the expression $\Omega$ in which all free occurrences of $x_1,\ldots,x_n$ have
simultaneously been replaced by $\Sigma_1,\ldots,\Sigma_n$. This, again, is an expression
in $\mathcal{E}$ (this can be proved by induction on the structure of $\Omega$).

$\mathbf{type}[x_1,\ldots,x_n := \Sigma_1,\ldots,\Sigma_n]$ is defined as $\mathbf{type}$.
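Definitions 5.1, 5.4 and 5.6 can be mirrored quite directly in a program. The following Python sketch is only an illustration: the class names `Var`, `Const`, `Abs`, `App` and the functions `fv` and `subst` are our own hypothetical choices, and `subst` relies on Convention 5.5 (bound names differ from free ones), so it performs no renaming of bound variables.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

TYPE = "type"  # the symbol type; it is not itself an element of E


@dataclass(frozen=True)
class Var:      # (variable) x
    name: str


@dataclass(frozen=True)
class Const:    # (parameter) a(S1,...,Sn); n = 0 is allowed
    name: str
    params: Tuple["Expr", ...] = ()


@dataclass(frozen=True)
class Abs:      # (abstraction) [x:S]O, where S may also be type
    var: str
    dom: Union["Expr", str]
    body: "Expr"


@dataclass(frozen=True)
class App:      # (application) (S2)S1: the argument S2 is written first
    arg: "Expr"
    fun: "Expr"


Expr = Union[Var, Const, Abs, App]


def fv(e) -> frozenset:
    """Free variables, clause by clause as in Definition 5.4."""
    if e == TYPE:
        return frozenset()
    if isinstance(e, Var):
        return frozenset({e.name})
    if isinstance(e, Const):
        return frozenset().union(*(fv(p) for p in e.params))
    if isinstance(e, Abs):
        return fv(e.dom) | (fv(e.body) - {e.var})
    return fv(e.fun) | fv(e.arg)


def subst(e, s: Dict[str, "Expr"]):
    """Simultaneous substitution e[x1,...,xn := S1,...,Sn] (Definition 5.6)."""
    if e == TYPE:
        return TYPE
    if isinstance(e, Var):
        return s.get(e.name, e)
    if isinstance(e, Const):
        return Const(e.name, tuple(subst(p, s) for p in e.params))
    if isinstance(e, Abs):
        inner = {k: v for k, v in s.items() if k != e.var}  # x is bound in the body
        return Abs(e.var, subst(e.dom, s), subst(e.body, inner))
    return App(subst(e.arg, s), subst(e.fun, s))


x, y = Var("x"), Var("y")
conj = Const("and", (x, y))               # and(x,y)
swapped = subst(conj, {"x": y, "y": x})   # both replacements happen at once
ident = Abs("x", Const("prop"), x)        # [x:prop]x has no free variables
```

Because the substitution is simultaneous, `swapped` is `and(y,x)` rather than `and(x,x)`, which is what two consecutive single substitutions would give.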


Definition 5.7 (Books and lines) An AUT-68-book (or book if no confusion
arises) is a finite list (possibly empty) of (AUT-68)-lines (to be defined
next). If $l_1,\ldots,l_n$ are the lines of book $\mathfrak{B}$, we write
$\mathfrak{B} \equiv l_1,\ldots,l_n$.

An AUT-68-line (or line if no confusion arises) is a 4-tuple
$(\Gamma; k; \Sigma_1; \Sigma_2)$. Here,

• $\Gamma$ is a context, i.e. a finite (possibly empty) list
  $x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n$, where the $x_i$s are different elements of $V$
  and the $\alpha_i$s are elements of $\mathcal{E}^+$;

• $k$ is an element of $V \cup C$;

• $\Sigma_1$ can be (only):

  ◦ The symbol $\text{---}$ (if $k \in V$);
  ◦ The symbol PN (if $k \in C$) (PN stands for “primitive notion”);
  ◦ An element of $\mathcal{E}$ (if $k \in C$);

• $\Sigma_2$ is an element of $\mathcal{E}^+$.


Remark 5.8 As regards the intended meaning of an AUTOMATH-line, we
note the following. There are three sorts of lines:

1. $(\Gamma; k; \text{---}; \Sigma_2)$ with $k \in V$. This is a variable declaration of the
   variable $k$ having type $\Sigma_2$. This does not really add a new statement to
   the book, but these declarations are needed to form contexts.

   Variables can play two roles. First of all they can represent an
   unspecified object of a certain type (compare this to the mathematical
   way of speaking: “let x be a natural number”). Secondly, a variable
   can act as a logical assumption. This happens if the variable has as
   type the proof of a certain proposition A. The usual mathematical
   way of speaking in such a situation is not “let x be a proof of A”,
   but: “assume A”;

2. $(\Gamma; k; \mathrm{PN}; \Sigma_2)$ with $k \in C$. This line introduces a primitive notion:
   a constant $k$ of type $\Sigma_2$. This constant can act as a primitive notion
   (for instance introducing the type of natural numbers, or introducing
   the number 0), or as an axiom (to be precise, a postulated inhabitant
   of the set of proofs of the proposition expressing the axiom).

   The introduction of $k$ is parametrised by the context $\Gamma$. For instance,
   if we want to introduce the primitive notion of “logical conjunction”,
   we do not want to have a separate primitive notion for each possible
   conjunction and(A, B).² Instead, we want to have one primitive
   notion and, to which we can add two propositions A and B as parameters
   when we want to form the proposition and(A, B). Therefore, we
   introduce and in a context $\Gamma \equiv$ x:prop, y:prop. Given certain
   propositions A, B this enables us to form the AUT-68-expression and(A, B);

3. $(\Gamma; k; \Sigma_1; \Sigma_2)$ with $k \in C$ and $\Sigma_1 \in \mathcal{E}$. This line introduces a
   definition. The definiendum $k$ is defined by the definiens $\Sigma_1$ and has
   type $\Sigma_2$. Definitions can be parametrised in a similar way as primitive
   notions. Definitions have two important applications:

   • They make it possible to abbreviate long expressions, thus keeping
     the structure of a book clear, and making manipulations with
     expressions more efficient;

   • They make it possible to give a name to an expression. For
     instance, we can abbreviate S(S(S(S(S(S(S(0))))))) by 7.

² Contrary to the habit in mathematics to use only one character (possibly indexed)
for a variable, AUTOMATH adopts the convention of computer science to use variables
that consist of more than one character. So and represents only one variable, and not
the application of a to n and d.

Example 5.9 In Figure 8 we give an example of an AUTOMATH-book that
introduces some elementary notions of propositional logic. We have
numbered each line in the example, and use these line numbers for reference in
our comments below. To keep things clear, we have omitted the types of
the variables in the context. The book consists of three parts:


• In lines 1-5 we introduce some basic material:

  1. We take the type prop as a primitive notion. This type can be
     interpreted as the type of propositions;

  2. We declare a variable x of type prop. This variable will be used
     in the sequel of the book;

  3. We similarly declare a variable y of type prop. We do this within
     the context x:prop. For reasons of space, we do not explicitly
     mention the type of x in the context; if necessary we can find
     that type in line 2;

  4. Given propositions x and y, we introduce a new primitive notion,
     the conjunction and(x,y) of x and y;

  5. Given a proposition x we introduce the type proof(x) of the
     proofs of x as a primitive notion. In this way, we can use the
     PAT principle à la de Bruijn (cf. Section 4a4);

• In lines 6-11 we show how we can construct proofs of propositions of
  the form and(x,y), and how we can use proofs of such propositions:

  6. Given propositions x and y, we assume that we have an expression
     px $\in V$ of type proof(x). In other words, the variable px
     represents an arbitrary proof of the proposition x;

  7. We also assume a proof py of y;

  8. Given the propositions x and y, and proofs px and py of x and
     y, we want to conclude that and(x,y) holds. This is an axiom
     of natural deduction, and we call this axiom and-I
     (and-introduction) in our book. An expression and-I(x,y,px,py) is
     a proof of and(x,y), so of type proof(and(x,y)).

     In line 8, we see proof(and) instead of proof(and(x,y)) as the
     type of and-I. This is usual notation in AUTOMATH, and keeps
     lines short. To be precise, this “default mechanism” works as
     follows. From line 4, we conclude that and should always carry
     two parameters. This is because the context of line 4 has two
     variables x and y. In the expression proof(and) in line 8, no
     parameters are provided for and. It is then implicitly assumed
     that the first two variables of the context of line 8 are used as
     “default parameters”. The first two variables of the context of
     line 8 are x and y. Therefore, proof(and) in line 8 should be
     read as proof(and(x,y)).

     In a similar way, we could write proof instead of proof(x) in line
     6. From line 5 (where proof is introduced) we find that proof
     carries one parameter. Writing just proof in line 6 means that
     we must use the first variable of the context of line 6, x, as a
     default parameter. We must write proof(y) in line 7. Writing
     just proof would give proof(x);

       | context     | identifier | definiens                | type
   ----+-------------+------------+--------------------------+----------------
    1. | ∅           | prop       | PN                       | type
    2. | ∅           | x          | —                        | prop
    3. | x           | y          | —                        | prop
    4. | x,y         | and        | PN                       | prop
    5. | x           | proof      | PN                       | type
    6. | x,y         | px         | —                        | proof(x)
    7. | x,y,px      | py         | —                        | proof(y)
    8. | x,y,px,py   | and-I      | PN                       | proof(and)
    9. | x,y         | pxy        | —                        | proof(and)
   10. | x,y,pxy     | and-O1     | PN                       | proof(x)
   11. | x,y,pxy     | and-O2     | PN                       | proof(y)
   12. | x           | prx        | —                        | proof(x)
   13. | x,prx       | and-R      | and-I(x,x,prx,prx)       | proof(and(x,x))
   14. | x,y,pxy     | and-S      | and-I(y,x,and-O2,and-O1) | proof(and(y,x))

   Figure 8: Example of an AUTOMATH-book


  9. We also want to express how we can use a proof of and(x,y).
     Therefore we introduce a variable pxy that represents an
     arbitrary proof of and(x,y);

  10. First of all, we want that x holds whenever and(x,y) holds.
      Therefore we introduce an axiom and-O1 (and-out, first
      and-elimination). Given propositions x,y and a proof pxy of the
      proposition and(x,y), and-O1(x,y,pxy) is a proof of x;

  11. Similarly, we introduce an axiom and-O2 that represents a proof
      of y;

• We can now derive some elementary theorems:

  12. We want to prove that we can derive and(x,x) from x. That
      is: whenever we have a proof of x, we can construct a proof
      of and(x,x). In line 6, we already introduced a variable for a
      proof of x: px. However, we declared this variable in the context
      x,y. As we do not want a second proposition y to occur in this
      theorem, we declare a new proof variable prx, in the context x;

  13. We derive our first small theorem: the reflexivity of the logical
      conjunction. Given a proposition x, and a proof prx of x, we
      can use the axiom and-I to find a proof of and(x,x): we can
      use the expression and-I(x,x,prx,prx) thanks to line 8. We give
      a name to this proof: and-R. If, anywhere in the sequel of the
      book, $\Sigma$ is a proposition, and $\Omega$ is a proof of $\Sigma$, we can write
      and-R($\Sigma$,$\Omega$) for a proof of and($\Sigma$,$\Sigma$). This is shorter, and more
      expressive, than the original expression and-I($\Sigma$,$\Sigma$,$\Omega$,$\Omega$);

  14. We can also show that and is symmetric. That is: whenever
      and(x,y) holds, we also have and(y,x). The idea is as follows.
      Given propositions x,y and a proof pxy of and(x,y), we
      can form proofs and-O1(x,y,pxy) of x and and-O2(x,y,pxy)
      of y. We can feed these proofs “in reverse order” to the axiom
      and-I: the expression and-I(y,x,and-O2,and-O1) represents
      a proof of and(y,x). The expression and-O2 should be read as
      and-O2(x,y,pxy) due to the “default parameter” mechanism.
      Similarly, and-O1 must be read as and-O1(x,y,pxy).
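The “default parameter” mechanism used in lines 8, 9 and 14 of the example can be sketched in a few lines of Python. This is only an illustration of the case described above, in which a constant is written without any parameters; the table `ARITY` and the function `expand` are hypothetical names, and the number of parameters of each constant is taken to be the length of the context of the line that introduces it.

```python
# Number of parameters of each constant: the length of the context of the
# line of the example book in which the constant is introduced.
ARITY = {"prop": 0, "and": 2, "proof": 1, "and-I": 4, "and-O1": 3, "and-O2": 3}


def expand(name, given, context):
    """Write out the parameters of `name` as used in a line with `context`.

    `given` lists the explicitly written parameters; `context` lists the
    variables of the current line's context, in order.
    """
    arity = ARITY[name]
    if len(given) == arity:
        params = list(given)                 # everything was written out
    elif not given:
        if len(context) < arity:
            raise ValueError("context too short to supply default parameters")
        params = list(context[:arity])       # first variables of the context
    else:
        raise ValueError("partially supplied parameters are not covered here")
    return name if not params else f"{name}({','.join(params)})"


line8 = expand("and", [], ["x", "y", "px", "py"])      # type of and-I in line 8
line7_bare = expand("proof", [], ["x", "y", "px"])     # bare 'proof' in line 7
line7_full = expand("proof", ["y"], ["x", "y", "px"])  # what line 7 must say
line14 = expand("and-O2", [], ["x", "y", "pxy"])       # inside and-S
```

The second and third examples show why line 7 has to spell out `proof(y)`: a bare `proof` in that context would be completed with the first context variable, `x`.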


5a2 Correct books 


Not all books are good books. If $(\Gamma; k; \Sigma_1; \Sigma_2)$ is a line of a book
$\mathfrak{B}$, the expressions $\Sigma_1$ and $\Sigma_2$ (as long as $\Sigma_1$ is not PN or
$\text{---}$, and $\Sigma_2$ is not $\mathbf{type}$) must be well-defined, i.e. the elements of
$V \cup C$ occurring in them must have been established (as variables, primitive
notions, or defined constants) in previous parts of $\mathfrak{B}$. The same holds for
the type assignments $x_i{:}\alpha_i$ that occur in $\Gamma$. Moreover, if $\Sigma_1$ is not PN
or $\text{---}$, then $\Sigma_1$ must be of the same type as $k$, hence $\Sigma_1$ must be of
type $\Sigma_2$ (within the context $\Gamma$). Finally, there should be only one
definition of any object in a book, so $k$ should not occur in the preceding
lines of the book.
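The two purely syntactic requirements just mentioned (every identifier has been established earlier, and no identifier is introduced twice) can be checked mechanically. The Python sketch below does only that and nothing more: it says nothing about typing, it keeps expressions as plain strings, it writes the block-opener symbol as `"---"`, and the names `idents` and `names_ok` are our own. The book is part of the example book of Figure 8, with default parameters written out.

```python
import re

RESERVED = {"type", "PN"}
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def idents(text):
    """Identifiers occurring in an expression written as a string."""
    return {t for t in IDENT.findall(text) if t not in RESERVED}


def names_ok(book):
    """Check that identifiers are introduced once and before they are used."""
    seen = set()
    for context, k, middle, typ in book:
        used = set(context) | idents(middle) | idents(typ)
        if k in seen or not used <= seen:
            return False
        seen.add(k)
    return True


book = [
    ([], "prop", "PN", "type"),
    ([], "x", "---", "prop"),
    (["x"], "y", "---", "prop"),
    (["x", "y"], "and", "PN", "prop"),
    (["x"], "proof", "PN", "type"),
    (["x", "y"], "px", "---", "proof(x)"),
    (["x", "y", "px"], "py", "---", "proof(y)"),
    (["x", "y", "px", "py"], "and-I", "PN", "proof(and(x,y))"),
    (["x"], "prx", "---", "proof(x)"),
    (["x", "prx"], "and-R", "and-I(x,x,prx,prx)", "proof(and(x,x))"),
]

without_and_I = book[:7] + book[8:]                  # and-R now uses an unknown name
with_duplicate = book + [([], "x", "---", "prop")]   # x is introduced twice
```

Here `names_ok(book)` holds, while both modified books are rejected. The type conditions on a line are the subject of the derivation rules that follow and are not covered by this check.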

Hence we need notions of correctness (with respect to a book and/or a
context) and we need a definition of the notion “$\Sigma_1$ is of type $\Sigma_2$” (within
a book and a context).

We write $\mathfrak{B};\emptyset \vdash \mathrm{OK}$ to indicate that a book $\mathfrak{B}$ is correct, and
$\mathfrak{B};\Gamma \vdash \mathrm{OK}$ to indicate that the context $\Gamma$ is correct with respect to the
(correct) book $\mathfrak{B}$. As the empty context will be correct with respect to any
correct book, this does not lead to misunderstandings.

We write $\mathfrak{B};\Gamma \vdash \Sigma$ (or $\mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \Sigma$ if confusion with other
derivation systems might arise) to indicate that $\Sigma$ is a correct expression
with respect to $\mathfrak{B}$ and $\Gamma$. We write $\mathfrak{B};\Gamma \vdash \Sigma_1 : \Sigma_2$ (or
$\mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \Sigma_1 : \Sigma_2$) to indicate that $\Sigma_1$ is a correct expression of
type $\Sigma_2$ with respect to $\mathfrak{B}$ and $\Gamma$. We also say: $\Sigma_1 : \Sigma_2$ is a correct
statement with respect to $\mathfrak{B}$ and $\Gamma$.

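The following two interrelated definitions are based on [40].

Definition 5.10 (Correct books and contexts) A book $\mathfrak{B}$ and a context
$\Gamma$ are correct if $\mathfrak{B};\Gamma \vdash \mathrm{OK}$ can be derived with the following rules
(the relation $=_{\beta d}$ (“definitional equality”) will be explained in Section
5a3. The rules use the notion of correct statement as given in Definition 5.11).

\[
\begin{array}{ll}
\text{(axiom)} &
  \emptyset;\emptyset \vdash \mathrm{OK} \\[1.5ex]
\text{(context ext.)} &
  \dfrac{\mathfrak{B}_1, (\Gamma; x; \text{---}; \alpha), \mathfrak{B}_2;\ \Gamma \vdash \mathrm{OK}}
        {\mathfrak{B}_1, (\Gamma; x; \text{---}; \alpha), \mathfrak{B}_2;\ \Gamma, x{:}\alpha \vdash \mathrm{OK}} \\[2.5ex]
\text{(book ext.: var1)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \mathrm{OK}}
        {\mathfrak{B}, (\Gamma; x; \text{---}; \mathbf{type});\ \emptyset \vdash \mathrm{OK}} \\[2.5ex]
\text{(book ext.: var2)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \Sigma_2 : \mathbf{type}}
        {\mathfrak{B}, (\Gamma; x; \text{---}; \Sigma_2);\ \emptyset \vdash \mathrm{OK}} \\[2.5ex]
\text{(book ext.: pn1)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \mathrm{OK}}
        {\mathfrak{B}, (\Gamma; k; \mathrm{PN}; \mathbf{type});\ \emptyset \vdash \mathrm{OK}} \\[2.5ex]
\text{(book ext.: pn2)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \Sigma_2 : \mathbf{type}}
        {\mathfrak{B}, (\Gamma; k; \mathrm{PN}; \Sigma_2);\ \emptyset \vdash \mathrm{OK}} \\[2.5ex]
\text{(book ext.: def1)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \Sigma_1 : \mathbf{type}}
        {\mathfrak{B}, (\Gamma; k; \Sigma_1; \mathbf{type});\ \emptyset \vdash \mathrm{OK}} \\[2.5ex]
\text{(book ext.: def2)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \Sigma_2 : \mathbf{type} \qquad
         \mathfrak{B};\Gamma \vdash \Sigma_1 : \Sigma_2' \qquad
         \mathfrak{B};\Gamma \vdash \Sigma_2 =_{\beta d} \Sigma_2'}
        {\mathfrak{B}, (\Gamma; k; \Sigma_1; \Sigma_2);\ \emptyset \vdash \mathrm{OK}}
\end{array}
\]

For the (book ext.) rules, we assume that the introduced identifiers $x \in V$
and $k \in C$ do not occur anywhere in $\mathfrak{B}$ and $\Gamma$.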


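Definition 5.11 (Correct statements) A statement $\mathfrak{B};\Gamma \vdash \Sigma : \Omega$ is
correct if it can be derived with the rules below (the start rule uses the
notions of correct context and correct book as given in Definition 5.10).

\[
\begin{array}{ll}
\text{(start)} &
  \dfrac{\mathfrak{B};\ \Gamma_1, x{:}\alpha, \Gamma_2 \vdash \mathrm{OK}}
        {\mathfrak{B};\ \Gamma_1, x{:}\alpha, \Gamma_2 \vdash x : \alpha} \\[2.5ex]
\text{(parameters)} &
  \dfrac{\begin{array}{c}
           \mathfrak{B} \equiv \mathfrak{B}_1, (x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n; b; \Omega_1; \Omega_2), \mathfrak{B}_2 \\
           \mathfrak{B};\Gamma \vdash \Sigma_i : \alpha_i[x_1,\ldots,x_{i-1} := \Sigma_1,\ldots,\Sigma_{i-1}]
             \quad (i = 1,\ldots,n)
         \end{array}}
        {\mathfrak{B};\Gamma \vdash b(\Sigma_1,\ldots,\Sigma_n) : \Omega_2[x_1,\ldots,x_n := \Sigma_1,\ldots,\Sigma_n]} \\[3.5ex]
\text{(abstr.1)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \Sigma_1 : \mathbf{type} \qquad
         \mathfrak{B};\Gamma, x{:}\Sigma_1 \vdash \Omega_1 : \mathbf{type}}
        {\mathfrak{B};\Gamma \vdash [x{:}\Sigma_1]\Omega_1 : \mathbf{type}} \\[2.5ex]
\text{(abstr.2)} & \\[2.5ex]
\text{(application)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \Sigma_1 : [x{:}\Omega_1]\Omega_2 \qquad
         \mathfrak{B};\Gamma \vdash \Sigma_2 : \Omega_1}
        {\mathfrak{B};\Gamma \vdash (\Sigma_2)\Sigma_1 : \Omega_2[x:=\Sigma_2]} \\[2.5ex]
\text{(conversion)} &
  \dfrac{\mathfrak{B};\Gamma \vdash \Sigma : \Omega_1 \qquad
         \mathfrak{B};\Gamma \vdash \Omega_2 : \mathbf{type} \qquad
         \mathfrak{B};\Gamma \vdash \Omega_1 =_{\beta d} \Omega_2}
        {\mathfrak{B};\Gamma \vdash \Sigma : \Omega_2}
\end{array}
\]

When using the parameter rule, we assume that $\mathfrak{B};\Gamma \vdash \mathrm{OK}$, even if $n = 0$.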


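Example 5.12 The book of Example 5.9 (see Figure 8) is correct. We
prove this line by line for the first four lines (the reader is invited to check
lines 5-14 for himself). We write (m-n) to denote the book that consists
of lines m to n of Example 5.9.

1. By (axiom), $\emptyset;\emptyset \vdash \mathrm{OK}$, so by (book ext.: pn1),
   $(\emptyset; \mathtt{prop}; \mathrm{PN}; \mathbf{type});\ \emptyset \vdash \mathrm{OK}$.

2. By (parameters), $(1\text{-}1);\ \emptyset \vdash \mathtt{prop} : \mathbf{type}$. Therefore by
   (book ext.: var2), $(1\text{-}1), (\emptyset; \mathtt{x}; \text{---}; \mathtt{prop});\ \emptyset \vdash \mathrm{OK}$.

3. By (context ext.), $(1\text{-}2);\ \mathtt{x{:}prop} \vdash \mathrm{OK}$, so by (parameters),
   $(1\text{-}2);\ \mathtt{x{:}prop} \vdash \mathtt{prop} : \mathbf{type}$. Therefore by (book ext.: var2),
   $(1\text{-}2), (\mathtt{x{:}prop}; \mathtt{y}; \text{---}; \mathtt{prop});\ \emptyset \vdash \mathrm{OK}$.

4. By two applications of (context ext.), $(1\text{-}3);\ \mathtt{x{:}prop}, \mathtt{y{:}prop} \vdash \mathrm{OK}$.
   By (parameters), $(1\text{-}3);\ \mathtt{x{:}prop}, \mathtt{y{:}prop} \vdash \mathtt{prop} : \mathbf{type}$.
   Therefore by (book ext.: pn2), $(1\text{-}4);\ \emptyset \vdash \mathrm{OK}$.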


5a3 Definitional equality 


We still need to describe the relation $=_{\beta d}$ (“definitional equality”). This
notion is based on both the definition mechanism and the
abstraction/application mechanism of AUT-68. The abstraction/application
mechanism provides the well-known notion of β-equality, originating from the
rule of β-conversion:

\[
(\Sigma)[x{:}\Omega_1]\Omega_2 \to_\beta \Omega_2[x:=\Sigma].
\]

We will use notations like $\to_\beta$, $\twoheadrightarrow_\beta$ and $=_\beta$ as usual (see A.11).

We now describe the definition mechanism of AUT-68 via the notion of
d-equality. This definition depends on the definition of derivability, and the
definition of derivability given in the previous subsection depends on the
definition of definitional equality. In fact, the definitions of correct book,
correct line, correct context, correct expression and definitional equality
should be given within one definition, using induction on the length of the
book. This would lead to a correct but very long definition, and that is
probably the reason why the definitions are split into smaller parts (in this
thesis as well as in [40]).


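Definition 5.13 (d-equality) Assume $\mathfrak{B};\Gamma \vdash \Sigma : \Omega$. We define the
d-normal form $\mathrm{nf}_d(\Sigma)$ of $\Sigma$ with respect to $\mathfrak{B}$ by induction on the
length of the book $\mathfrak{B}$. So, assume $\mathrm{nf}_d(\Sigma)$ has already been defined for
all books $\mathfrak{B}'$ with fewer lines than $\mathfrak{B}$, and all expressions $\Sigma$ that are
correct with respect to $\mathfrak{B}'$ and a context $\Gamma$. Use induction on the
structure of $\Sigma$:

• If $\Sigma$ is a variable $x$, then $\mathrm{nf}_d(\Sigma) \stackrel{\text{def}}{=} x$;

• Now assume $\Sigma \equiv b(\Omega_1,\ldots,\Omega_n)$, and assume that the normal forms
  of the $\Omega_i$s have already been defined.
  Determine a line $(\Delta; b; \Xi_1; \Xi_2)$ in the book $\mathfrak{B}$ (there is exactly one
  such line, and this line is determined by $b$).
  Write $\Delta \equiv x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n$. Distinguish:

  ◦ $\Xi_1 \equiv \text{---}$. This case doesn’t occur, as $b \in C$;

  ◦ $\Xi_1 \equiv \mathrm{PN}$. Then define
    $\mathrm{nf}_d(\Sigma) \stackrel{\text{def}}{=} b(\mathrm{nf}_d(\Omega_1),\ldots,\mathrm{nf}_d(\Omega_n))$;

  ◦ $\Xi_1$ is an expression. Then $\Xi_1$ is correct with respect to a book
    $\mathfrak{B}'$ that contains fewer lines than $\mathfrak{B}$ ($\mathfrak{B}'$ doesn’t contain the line
    $(\Delta; b; \Xi_1; \Xi_2)$, and all lines of $\mathfrak{B}'$ are also lines of $\mathfrak{B}$), hence we
    can assume that $\mathrm{nf}_d(\Xi_1)$ has already been defined. Now define
    $\mathrm{nf}_d(\Sigma) \stackrel{\text{def}}{=} \mathrm{nf}_d(\Xi_1)[x_1,\ldots,x_n := \mathrm{nf}_d(\Omega_1),\ldots,\mathrm{nf}_d(\Omega_n)]$;

• If $\Sigma \equiv [x{:}\Omega_1]\Omega_2$ then
  $\mathrm{nf}_d(\Sigma) \stackrel{\text{def}}{=} [x{:}\mathrm{nf}_d(\Omega_1)]\mathrm{nf}_d(\Omega_2)$;

• If $\Sigma \equiv (\Omega_2)\Omega_1$ then
  $\mathrm{nf}_d(\Sigma) \stackrel{\text{def}}{=} (\mathrm{nf}_d(\Omega_2))\mathrm{nf}_d(\Omega_1)$.

We write $\Sigma_1 =_d \Sigma_2$ when $\mathrm{nf}_d(\Sigma_1) \equiv \mathrm{nf}_d(\Sigma_2)$.

As we see, the d-normal form $\mathrm{nf}_d(\Sigma)$ of a correct expression $\Sigma$ depends on
the book $\mathfrak{B}$, and in order to be completely correct we should write
$\mathrm{nf}_{d\mathfrak{B}}(\Sigma)$ instead of only $\mathrm{nf}_d(\Sigma)$. We will, however, omit the subscript
$\mathfrak{B}$ as long as no confusion arises.

We write $=_{\beta d}$ for the smallest equivalence relation that contains both
$=_\beta$ and $=_d$.

Definition 5.14 (Definitional equality) $\Sigma_1$ and $\Sigma_2$ are called
definitionally equal (with respect to a book $\mathfrak{B}$) if $\Sigma_1 =_{\beta d} \Sigma_2$.

This definition completes the description of AUT-68. Again, definitional
equality of expressions $\Sigma_1$ and $\Sigma_2$ depends on the book $\mathfrak{B}$, so we should
write $=_{\beta d\mathfrak{B}}$ instead of $=_{\beta d}$. Also in this case we leave out the subscript
$\mathfrak{B}$ as long as no confusion arises.

As an alternative to Definition 5.13, we describe the notion of d-equality
via a reduction relation.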


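Definition 5.15 (δ-reduction) Let $\mathfrak{B}$ be a book, $\Gamma$ a correct context with
respect to $\mathfrak{B}$, and $\Sigma$ a correct expression with respect to $\mathfrak{B};\Gamma$. We define
$\to_\delta$ by the usual compatibility rules, and

(δ) If $\Sigma \equiv b(\Sigma_1,\ldots,\Sigma_n)$, and $\mathfrak{B}$ contains a line
$(x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n; b; \Xi_1; \Xi_2)$ where $\Xi_1 \in \mathcal{E}$, then

\[
\Sigma \to_\delta \Xi_1[x_1,\ldots,x_n := \Sigma_1,\ldots,\Sigma_n].
\]

We say that $\Sigma$ is in δ-normal form if for no expression $\Omega$, $\Sigma \to_\delta \Omega$, and
use notations like $\twoheadrightarrow_\delta$, $\twoheadrightarrow_\delta^+$ and $=_\delta$ as usual. $\to_\delta$ depends on
$\mathfrak{B}$, but as we did before with $\mathrm{nf}_d$ and $=_d$, we only explicitly mention this
if it is not clear in relation to which book $\mathfrak{B}$ $\to_\delta$ is considered.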
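The two descriptions of d-equality, via the d-normal form of Definition 5.13 and via δ-reduction, can be compared on small examples with the following Python sketch. Everything about the representation is an assumption made for illustration only: a book is reduced to a dictionary `BOOK` that maps each constant to its context variables and its definiens (`None` for a PN line), types are ignored, and `subst` relies on Convention 5.5 and therefore does no renaming.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:                 # b(S1,...,Sn)
    name: str
    params: Tuple = ()


@dataclass(frozen=True)
class Abs:                   # [x:S]O
    var: str
    dom: object
    body: object


@dataclass(frozen=True)
class App:                   # (S2)S1
    arg: object
    fun: object


def subst(e, s):
    """Simultaneous substitution; no renaming (Convention 5.5 is assumed)."""
    if isinstance(e, Var):
        return s.get(e.name, e)
    if isinstance(e, Const):
        return Const(e.name, tuple(subst(p, s) for p in e.params))
    if isinstance(e, Abs):
        inner = {k: v for k, v in s.items() if k != e.var}
        return Abs(e.var, subst(e.dom, s), subst(e.body, inner))
    if isinstance(e, App):
        return App(subst(e.arg, s), subst(e.fun, s))
    return e                                  # the symbol "type"


def nf_d(e, book):
    """d-normal form, following the clauses of Definition 5.13."""
    if isinstance(e, Const):
        args = tuple(nf_d(p, book) for p in e.params)
        xs, definiens = book[e.name]
        if definiens is None:                 # PN line
            return Const(e.name, args)
        return subst(nf_d(definiens, book), dict(zip(xs, args)))
    if isinstance(e, Abs):
        return Abs(e.var, nf_d(e.dom, book), nf_d(e.body, book))
    if isinstance(e, App):
        return App(nf_d(e.arg, book), nf_d(e.fun, book))
    return e                                  # variable or "type"


def delta_step(e, book):
    """One δ-step (outermost redex first), or None if e is δ-normal."""
    if isinstance(e, Const):
        xs, definiens = book[e.name]
        if definiens is not None:             # rule (δ): unfold the definition
            return subst(definiens, dict(zip(xs, e.params)))
        for i, p in enumerate(e.params):
            q = delta_step(p, book)
            if q is not None:
                return Const(e.name, e.params[:i] + (q,) + e.params[i + 1:])
    elif isinstance(e, Abs):
        d = delta_step(e.dom, book)
        if d is not None:
            return Abs(e.var, d, e.body)
        b = delta_step(e.body, book)
        if b is not None:
            return Abs(e.var, e.dom, b)
    elif isinstance(e, App):
        a = delta_step(e.arg, book)
        if a is not None:
            return App(a, e.fun)
        f = delta_step(e.fun, book)
        if f is not None:
            return App(e.arg, f)
    return None


def delta_nf(e, book):
    """Repeat δ-steps until none applies."""
    step = delta_step(e, book)
    while step is not None:
        e, step = step, delta_step(step, book)
    return e


x, prx, a, p = Var("x"), Var("prx"), Var("a"), Var("p")
BOOK = {
    "and": (("x", "y"), None),
    "and-I": (("x", "y", "px", "py"), None),
    "and-R": (("x", "prx"), Const("and-I", (x, x, prx, prx))),
}
e1 = Const("and-R", (a, p))
e2 = Const("and-R", (Const("and", (a, a)), e1))
```

On these examples the outermost δ-steps of `delta_nf` and the innermost evaluation of `nf_d` arrive at the same expression; for instance `e1` unfolds to `and-I(a,a,p,p)` either way.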


The relations $=_d$ and $=_\delta$ are the same:

Lemma 5.16

1. (Church-Rosser) If $A_1 =_\delta A_2$ then there is $B$ such that
   $A_1 \twoheadrightarrow_\delta B$ and $A_2 \twoheadrightarrow_\delta B$;

2. $\mathrm{nf}_d(\Sigma)$ is the (unique) δ-normal form of $\Sigma$;

3. $\Sigma =_\delta \Omega$ if and only if $\Sigma =_d \Omega$.

PROOF: AUT-68 with $\to_\delta$ can be seen as an orthogonal term rewrite system
(see [75]).

1. Such a term rewrite system has the Church-Rosser property (see [75]);

2. It is not hard to show that $\Sigma \twoheadrightarrow_\delta \mathrm{nf}_d(\Sigma)$. By induction on the
   definition of $\mathrm{nf}_d(\Sigma)$ one shows that $\mathrm{nf}_d(\Sigma)$ is in δ-normal form. The
   uniqueness of this normal form follows from the Church-Rosser property;

3. If $\Sigma =_\delta \Omega$ then by (1) there is $\Psi$ such that $\Sigma \twoheadrightarrow_\delta \Psi$ and
   $\Omega \twoheadrightarrow_\delta \Psi$. This means that the δ-normal forms of $\Sigma$ and $\Omega$ are
   equal, so by (2), $\mathrm{nf}_d(\Sigma) \equiv \mathrm{nf}_d(\Omega)$.

   On the other hand, if $\mathrm{nf}_d(\Sigma) \equiv \mathrm{nf}_d(\Omega)$, then $\Sigma$ and $\Omega$ have the same
   δ-normal forms (by (2)), so $\Sigma =_\delta \Omega$.

$\boxtimes$

Lemma 5.17 The relation $\to_\delta$ is strongly normalising.

PROOF: We already know that $\to_\delta$ is weakly normalising (by Lemma 5.16(2)).
Moreover, the definition of $\mathrm{nf}_d(\Sigma)$ in 5.13 induces an innermost reduction
strategy. By a theorem of O’Donnell (see [96], or pp. 75-76 of [75]), $\to_\delta$ is
strongly normalising. $\boxtimes$


5a4 Some elementary properties 


Although we do not want to give a complete overview of all the meta- 
theoretical properties of AUTOMATH (these are studied in [91] and [40]), we 
do present some properties that we will need at a later stage. 


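Definition 5.18 A book $\mathfrak{B}$ is part of another book $\mathfrak{B}'$, denoted as
$\mathfrak{B} \subseteq \mathfrak{B}'$, if all lines of $\mathfrak{B}$ are lines of $\mathfrak{B}'$ as well. Similarly, a
context $\Gamma$ is part of another context $\Gamma'$, notation $\Gamma \subseteq \Gamma'$, if all
declarations $x{:}\alpha$ of $\Gamma$ are declarations in $\Gamma'$ as well.

Lemma 5.19 (Weakening for AUT-68) If $\mathfrak{B};\Gamma \vdash \Sigma : \Omega$, $\mathfrak{B} \subseteq \mathfrak{B}'$,
$\Gamma \subseteq \Gamma'$ and $\mathfrak{B}';\Gamma' \vdash \mathrm{OK}$ then $\mathfrak{B}';\Gamma' \vdash \Sigma : \Omega$.

PROOF: By induction on the derivation of $\mathfrak{B};\Gamma \vdash \Sigma : \Omega$. $\boxtimes$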


5b From AUT-68 towards a PTS 


We want to give a description of AUT-68 within the framework of the Pure 
Type Systems. There are several ways to do this. One of the most im- 
portant choices to be made is whether or not to maintain the parameter 
mechanism (that is: To allow expressions with parameters, as in the sec- 
ond clause of Definition 5.1). On the one hand, the parameter mechanism 
is an important feature of AUTOMATH. On the other hand PTSs do not 
have a parameter mechanism, and the parameter mechanism can be easily 
imitated by function application (cf. the second clause of the forthcoming 
Definition 5.20). Moreover, the description by van Benthem Jutting in [5] of 
the systems AUT-68 and AUT-QE in a PTS style does not use parameters. 

In this chapter, we provide a translation to PTSs without parameters. 
In doing so, we can explain van Benthem Jutting’s description of AUT-68 
and AUT-QE. 

We will see, however, that the way in which we must handle parameters 
in the resulting PTS is a bit artificial. Moreover, we think that parameters 
play an important role in the AUTOMATH systems, and that they could 
play a similar role in other PTSs. Therefore, we will present an extension 
of PTSs with parameters in Chapter 6. This extension is based on the way 
in which parameters are handled in AUTOMATH, and it will be shown that 
AUTOMATH can be described very well within these PTSs with parameters. 

For a description of AUT-68 in PTSs without parameters, we must first
make a translation of the expressions in AUT-68 to typed λ-terms. This
translation is very straightforward:


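Definition 5.20 We define a mapping $\overline{\,\cdots\,}$ from the correct expressions
in $\mathcal{E}$ (relative to a book $\mathfrak{B}$ and a context $\Gamma$) to $\mathbb{T}$, the set of terms
for PTSs. We assume that $C \cup V \subseteq \mathcal{V}$ ($\mathcal{V}$ is the set of variables for
PTS-terms).

• $\overline{x} \stackrel{\text{def}}{=} x$ for $x \in V$;

• $\overline{b(\Sigma_1,\ldots,\Sigma_n)} \stackrel{\text{def}}{=} b\,\overline{\Sigma_1} \cdots \overline{\Sigma_n}$;

• $\overline{[x{:}\Sigma]\Omega} \stackrel{\text{def}}{=} \Pi x{:}\overline{\Sigma}.\overline{\Omega}$ if $[x{:}\Sigma]\Omega$ has type
  $\mathbf{type}$, otherwise $\overline{[x{:}\Sigma]\Omega} \stackrel{\text{def}}{=} \lambda x{:}\overline{\Sigma}.\overline{\Omega}$;

• $\overline{(\Omega)\Sigma} \stackrel{\text{def}}{=} \overline{\Sigma}\,\overline{\Omega}$.

Moreover, we define: $\overline{\mathbf{type}} \stackrel{\text{def}}{=} *$.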

In the second clause of this definition we see that the parameter mecha- 
nism of Definition 5.1 is replaced by repeated function application in PTSs. 
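The translation of Definition 5.20 can be sketched as a small Python function that prints PTS-terms as strings. This is an illustration under assumptions of our own: the classes `Var`, `Const`, `Abs`, `App` and the function `translate` are hypothetical names, and since deciding between Π and λ needs typing information, the caller has to pass a predicate `has_type_type` that tells whether an abstraction has type type.

```python
from dataclasses import dataclass
from typing import Tuple

TYPE = "type"


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:                 # b(S1,...,Sn)
    name: str
    params: Tuple = ()


@dataclass(frozen=True)
class Abs:                   # [x:S]O
    var: str
    dom: object
    body: object


@dataclass(frozen=True)
class App:                   # (O)S: the argument O is written first
    arg: object
    fun: object


def translate(e, has_type_type):
    """PTS-notation (as a string) for an AUT-68 expression."""
    if e == TYPE:
        return "*"
    if isinstance(e, Var):
        return e.name
    if isinstance(e, Const):
        # parameters become repeated function application
        return " ".join([e.name] + [atom(p, has_type_type) for p in e.params])
    if isinstance(e, Abs):
        binder = "Π" if has_type_type(e) else "λ"
        dom = translate(e.dom, has_type_type)
        return f"{binder}{e.var}:{dom}.{translate(e.body, has_type_type)}"
    # application: the function comes first in PTS-notation
    return f"{atom(e.fun, has_type_type)} {atom(e.arg, has_type_type)}"


def atom(e, has_type_type):
    """Like translate, but parenthesise anything that is not a single symbol."""
    text = translate(e, has_type_type)
    simple = e == TYPE or isinstance(e, Var) or (isinstance(e, Const) and not e.params)
    return text if simple else f"({text})"


x, y = Var("x"), Var("y")
never = lambda e: False      # treat every abstraction as a λ-term
always = lambda e: True      # treat every abstraction as a Π-type

t_and = translate(Const("and", (x, y)), never)
t_app = translate(App(y, Abs("x", Const("prop"), x)), never)
t_pi = translate(Abs("x", Const("prop"), Const("proof", (x,))), always)
```

The first example shows the second clause at work: `and(x,y)` becomes the repeated application `and x y`. The second shows how the AUTOMATH order of argument and function is reversed in PTS-notation.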


With this translation in mind, we want to find a type system λ68 that
“suits” AUT-68, i.e. if $\Sigma$ is a correct expression of type $\Omega$ with respect to
a book $\mathfrak{B}$ and a context $\Gamma$, then we want $\mathfrak{B}',\Gamma' \vdash \overline{\Sigma} : \overline{\Omega}$ to be
derivable in λ68, and vice versa. Here, $\mathfrak{B}'$ and $\Gamma'$ are some suitable
translations of $\mathfrak{B}$ and $\Gamma$. The search for a suitable λ68 will concentrate on
three points, which we first discuss informally. In the next section we give a
formal definition of λ68, and prove that it has the desired property.


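5b1 The choice of the correct formation (Π) rules

When we keep in mind that $\overline{\mathbf{type}} \equiv *$, the definition of correct
expressions 5.11 gives a clear answer to the question of which Π-rules are
implied by the abstraction mechanism of AUT-68. The rule

\[
\frac{\mathfrak{B};\Gamma \vdash \Sigma_1 : \mathbf{type} \qquad \mathfrak{B};\Gamma, x{:}\Sigma_1 \vdash \Omega_1 : \mathbf{type}}
     {\mathfrak{B};\Gamma \vdash [x{:}\Sigma_1]\Omega_1 : \mathbf{type}}
\]

immediately translates into Π-rule $(*,*,*)$ for PTSs:

\[
\frac{\mathfrak{B}',\Gamma' \vdash \overline{\Sigma_1} : * \qquad \mathfrak{B}',\Gamma', x{:}\overline{\Sigma_1} \vdash \overline{\Omega_1} : *}
     {\mathfrak{B}',\Gamma' \vdash (\Pi x{:}\overline{\Sigma_1}.\overline{\Omega_1}) : *}
\]

where $\mathfrak{B}'$ and $\Gamma'$ are suitable translations of $\mathfrak{B}$ and $\Gamma$.

It is, however, not immediately clear which Π-rules are induced by the
parameter mechanism of AUT-68.

Let $\Sigma \equiv b(\Sigma_1,\ldots,\Sigma_n)$ be a correct expression of type $\Omega$ with respect
to a book $\mathfrak{B}$ and a context $\Gamma$. There is a line (see Definition 5.10)

\[
(x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n; b; \Xi_1; \Xi_2)
\]

in $\mathfrak{B}$ such that each $\Sigma_i$ is a correct expression with respect to $\mathfrak{B}$ and
$\Gamma$, and has a type that is definitionally equal to
$\alpha_i[x_1,\ldots,x_{i-1} := \Sigma_1,\ldots,\Sigma_{i-1}]$.
We also know that $\Omega =_{\beta d} \Xi_2[x_1,\ldots,x_n := \Sigma_1,\ldots,\Sigma_n]$.

Now $\overline{\Sigma} \equiv b\,\overline{\Sigma_1} \cdots \overline{\Sigma_n}$, and, assuming that we can derive in λ68
that $\overline{\Sigma_i}$ has type

\[
\overline{\alpha_i}[x_1,\ldots,x_{i-1} := \overline{\Sigma_1},\ldots,\overline{\Sigma_{i-1}}],
\]

it is not unreasonable to assign the type
$\Pi x_1{:}\overline{\alpha_1} \cdots \Pi x_n{:}\overline{\alpha_n}.\overline{\Xi_2}$ to $b$. We will
abbreviate this last term by $\prod_{i=1}^{n} x_i{:}\overline{\alpha_i}.\overline{\Xi_2}$. Then we can derive
(using $n$ times the application rule that we will introduce for λ68) that
$\overline{\Sigma}$ has type $\overline{\Omega}$ in λ68.

It is important to notice that the type of $b$,
$\prod_{i=1}^{n} x_i{:}\overline{\alpha_i}.\overline{\Xi_2}$, does not necessarily have an equivalent in AUT-68,
as in AUT-68 abstractions over type are not allowed (only abstractions over
expressions $\Sigma$ that have type as type are possible; cf. Definition 5.11). In
other words, the type of $b$, $\prod_{i=1}^{n} x_i{:}\overline{\alpha_i}.\overline{\Xi_2}$, is not necessarily a
first-class citizen of AUT-68 and should therefore have special treatment in
λ68. This is the reason to create a special sort $\triangle$, in which these types of
AUT-68 constants and definitions are stored. This idea originates from
Van Benthem Jutting, and was first presented in [5].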

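If we construct $\Pi x_n{:}\overline{\alpha_n}.\overline{\Xi_2}$ from $\overline{\Xi_2}$, we must use a rule
$(s_1, s_2, s_3)$, where $s_1, s_2, s_3$ are sorts. Sort $s_1$ must be the type of
$\overline{\alpha_n}$. As $\alpha_n \equiv \mathbf{type}$ or $\alpha_n$ has type $\mathbf{type}$, we must allow the
possibilities $s_1 \equiv *$ and $s_1 \equiv \Box$. Similarly, $\Xi_2 \equiv \mathbf{type}$ or $\Xi_2$ has
type $\mathbf{type}$, so we also allow $s_2 \equiv *$ and $s_2 \equiv \Box$. As we intended to
store the new type in sort $\triangle$, we take $s_3 \equiv \triangle$.

For similar reasons, we introduce rules $(*, \triangle, \triangle)$ and $(\Box, \triangle, \triangle)$ to
construct $\prod_{i=1}^{n} x_i{:}\overline{\alpha_i}.\overline{\Xi_2}$ from $\Pi x_n{:}\overline{\alpha_n}.\overline{\Xi_2}$ for $n > 1$.

As a result, we have the following Π-rules:

\[
\begin{array}{ll}
(*,*,*); & \\
(*,*,\triangle); & (\Box,*,\triangle); \\
(*,\Box,\triangle); & (\Box,\Box,\triangle); \\
(*,\triangle,\triangle); & (\Box,\triangle,\triangle).
\end{array}
\]

We do not have rules of the form $(\triangle, s_2, s_3)$ or $(s_1, \triangle, s_3)$ with
$s_3 \equiv *$ or $s_3 \equiv \Box$. So types of sort $\triangle$ cannot be used to construct types of
other sorts. In this way, we can keep the types of the λ-calculus part of
AUT-68 separated from the types of the parameter mechanism: the last ones are
stored in $\triangle$.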
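The seven Π-rules can be written down as plain data, which makes the remarks above easy to check. The following Python sketch is an illustration only; the names `PI_RULES`, `product_sorts` and `parameter_type_sort` are hypothetical, and the second function simply replays the construction described above, building the type of a constant from the inside out with rules whose third component is the sort △.

```python
STAR, BOX, TRI = "*", "□", "△"

# The Π-rules (s1, s2, s3) listed above.
PI_RULES = {
    (STAR, STAR, STAR),
    (STAR, STAR, TRI), (BOX, STAR, TRI),
    (STAR, BOX, TRI), (BOX, BOX, TRI),
    (STAR, TRI, TRI), (BOX, TRI, TRI),
}


def product_sorts(s1, s2):
    """All sorts s3 for which (s1, s2, s3) is a Π-rule."""
    return {s3 for (a, b, s3) in PI_RULES if (a, b) == (s1, s2)}


def parameter_type_sort(param_sorts, result_sort):
    """Sort of the type of a constant with n >= 1 parameters.

    param_sorts[i] is the sort of the translated type of the i-th context
    variable; result_sort is the sort of the translated result type.
    """
    current = result_sort
    for s1 in reversed(param_sorts):          # innermost product first
        if (s1, current, TRI) not in PI_RULES:
            raise ValueError(f"no rule ({s1}, {current}, {TRI})")
        current = TRI
    return current


# 'and' is introduced in the context x:prop, y:prop with result type prop.
sort_of_and = parameter_type_sort([STAR, STAR], STAR)
# 'proof' is introduced in the context x:prop with result type type.
sort_of_proof = parameter_type_sort([STAR], BOX)
```

Two observations can be read off directly. `product_sorts("△", s)` is empty for every sort `s`, so a type of sort △ can never serve as the domain of a product. And `product_sorts("*", "*")` contains both `*` and `△`, which is the overlap between the rules $(*,*,*)$ and $(*,*,\triangle)$.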

In Example 5.2.4.8 of [5], there is no rule (*,*, A). In principle, this 
rule is superfluous, as each application of rule (+, *, A) can be replaced by 
an application of rule (*, *,*). Nevertheless we want to maintain this rule: 


• First of all, the presence of both $(*, *, *)$ and $(*, *, \triangle)$ in the system
stresses the fact that AUT-68 has two type mechanisms: one
provided by the parameter mechanism and one by the λ-abstraction
mechanism;

• Secondly, there are technical arguments to make a distinction between
types formed by the abstraction mechanism and types that appear
via the parameter mechanism. In this thesis, we will denote product
types constructed by the abstraction mechanism in the usual way (so:
$\Pi x{:}A.B$), whilst we will (from now on) use the notation $\P x{:}A.B$ for
a type constructed by the parameter mechanism. Hence, we have
for the constant $b$ above that $b : \P_{i=1}^{n} x_i{:}\alpha_i.\Sigma_2$.³ As an additional
advantage, the resulting system will maintain Unicity of Types.⁴ This
would have been lost if we had introduced rules $(*, *, *)$ and $(*, *, \triangle)$
without making this difference, as we can then derive both

\[ \frac{\alpha{:}* \vdash \alpha : * \qquad \alpha{:}*,\, x{:}\alpha \vdash \alpha : *}{\alpha{:}* \vdash (\Pi x{:}\alpha.\alpha) : *} \quad \text{(rule } (*, *, *)\text{)} \]

and

\[ \frac{\alpha{:}* \vdash \alpha : * \qquad \alpha{:}*,\, x{:}\alpha \vdash \alpha : *}{\alpha{:}* \vdash (\Pi x{:}\alpha.\alpha) : \triangle} \quad \text{(rule } (*, *, \triangle)\text{)} \]

³ We use $\P_{i=1}^{n} x_i{:}\alpha_i.\Sigma_2$ as an abbreviation for $\P x_1{:}\alpha_1.\cdots\P x_n{:}\alpha_n.\Sigma_2$.
⁴ The system as presented in [5] has Unicity of Types as well, because it does not have
the Π-formation rule $(*, *, \triangle)$ and is therefore singly sorted.

• There is another reason to make a distinction between types formed by
the abstraction mechanism and types that appear in the translation
via the definition mechanism. For the moment, we consider AUT-68
without so-called Π-application. In AUT-68 with Π-application
(call this system AUT-68Π for the moment; see also Section 5d3) the
application rule of Definition 5.11

\[ \frac{\mathfrak{B};\Gamma \vdash \Sigma_1 : [x{:}\Omega_1]\Omega_2 \qquad \mathfrak{B};\Gamma \vdash \Sigma_2 : \Omega_1}{\mathfrak{B};\Gamma \vdash \langle\Sigma_2\rangle\Sigma_1 : \Omega_2[x{:=}\Sigma_2]} \]

is replaced by


BTE I: STH De: 
B;T H (Da) 21 : (Ea) Q2 = 


but the rule describing the type of $b(\Sigma_1,\ldots,\Sigma_n)$ is the same as the
rule in Definition 5.11 (parameters).


So if we want to make a translation of AUT-68Π, the application rule
for Π-terms has to be different from the application rule for ¶-terms.
Without distinction between Π-terms and ¶-terms, it would be impossible
to amend the system to represent AUT-68Π. Distinguishing
between Π-terms and ¶-terms makes it possible to obtain a translation
of AUT-68Π from the translation of AUT-68 in a simple way.


5b2 The different treatment of constants and variables 


When we seek a translation in λ68 of the AUT-68 judgement $\mathfrak{B};\Gamma \vdash
\Sigma : \Omega$, we must pay extra attention to the translation of $\mathfrak{B}$, as there is no
equivalent of books in PTSs. Our solution is to store the information on
identifiers of $\mathfrak{B}$ in a PTS-context. Therefore, contexts of λ68 will have the
form $\Delta;\Gamma$. The left part $\Delta$ contains type information on primitive notions
and definitions, and can be seen as the translation of the information on
primitive notions and definitions in $\mathfrak{B}$. In the right part $\Gamma$ we find the usual
type information on variables.

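The idea to store the constant information of $\mathfrak{B}$ in the left part of the
context arises in a natural way. Let $\mathfrak{B}$ be a correct AUT-68 book, to which
we add a line $(\Gamma; b; \mathrm{PN}; \Sigma_2)$. Then $\Gamma \equiv x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n$ is a correct context
with respect to $\mathfrak{B}$, and $\mathfrak{B};\Gamma \vdash \Sigma_2{:}$type or $\Sigma_2 \equiv$ type. In λ68 we can work
as follows. Assume the information on constants in $\mathfrak{B}$ has been translated
into the left part $\Delta$ of a λ68 context. We have (assuming that λ68 is a
type system that behaves like AUT-68, and writing $\overline{\Gamma}$ for the translation
$x_1{:}\overline{\alpha_1},\ldots,x_n{:}\overline{\alpha_n}$ of $\Gamma$):

\[ \Delta;\overline{\Gamma} \vdash \overline{\Sigma_2} : s \]

($s \equiv *$ if $\mathfrak{B};\Gamma \vdash \Sigma_2{:}$type; $s \equiv \Box$ if $\Sigma_2 \equiv$ type). Applying the ¶-formation
rule n times, we obtain

\[ \Delta; \vdash \P\overline{\Gamma}.\overline{\Sigma_2} : \triangle \]

(If $\Gamma$ is the empty context, then $\P\overline{\Gamma}.\overline{\Sigma_2} \equiv \overline{\Sigma_2}$, and $\overline{\Sigma_2}$ has type $*$ or $\Box$
instead of $\triangle$. We write $\P\Gamma$ for $\P_{i=1}^{n} x_i{:}\alpha_i$). As $\P\overline{\Gamma}.\overline{\Sigma_2}$ is exactly the type
that we want to give to $b$ (see the discussion in Subsection 5b1), we use this
statement as premise for the start rule that introduces $b$. As the right part
$\overline{\Gamma}$ of the original context has disappeared when we applied the ¶-formation
rules, $b{:}\P\overline{\Gamma}.\overline{\Sigma_2}$ is automatically placed at the righthand end of $\Delta$: the
conclusion of the start rule is

\[ \Delta,\, b{:}\P\overline{\Gamma}.\overline{\Sigma_2}; \vdash b : \P\overline{\Gamma}.\overline{\Sigma_2}. \]

Adding $b{:}\P\overline{\Gamma}.\overline{\Sigma_2}$ at the end of $\Delta$ can be compared with adding the line
$(\Gamma; b; \mathrm{PN}; \Sigma_2)$ at the end of $\mathfrak{B}$.
The process above can be captured in one rule:

\[ \frac{\Delta;\overline{\Gamma} \vdash \overline{\Sigma_2} : s_1 \qquad \Delta; \vdash \P\overline{\Gamma}.\overline{\Sigma_2} : s_2}{\Delta,\, b{:}\P\overline{\Gamma}.\overline{\Sigma_2}; \vdash b : \P\overline{\Gamma}.\overline{\Sigma_2}} \]

Here $s_1 \in \{*, \Box\}$ (compare: $\Sigma_2{:}$type or $\Sigma_2 \equiv$ type) and $s_2 \in \{*, \Box, \triangle\}$
(usually, $s_2 \equiv \triangle$; the cases $s_2 \equiv *, \Box$ only occur if $\Gamma$ is empty).
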
5b3 The definition system 


A line $(x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n; b; \Sigma_1; \Sigma_2)$, in which $b$ is a constant and $\Sigma_1 \in \mathcal{E}$,
represents a definition. It should be read as: For all expressions $\Omega_1,\ldots,\Omega_n$
(obeying certain type conditions), $b(\Omega_1,\ldots,\Omega_n)$ is an abbreviation for
$\Sigma_1[x_1,\ldots,x_n{:=}\Omega_1,\ldots,\Omega_n]$, and has type

\[ \Sigma_2[x_1,\ldots,x_n{:=}\Omega_1,\ldots,\Omega_n]. \]

So in λ68, the context should also mention that $bX_1\cdots X_n$ "is equal to"
$\Sigma_1[x_1,\ldots,x_n{:=}X_1,\ldots,X_n]$, for all terms $X_1,\ldots,X_n$. The most
straightforward way to do this is to write

\[ b := (\lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1) : (\P_{i=1}^{n} x_i{:}\alpha_i.\Sigma_2) \]

in the context instead of only $b : \P_{i=1}^{n} x_i{:}\alpha_i.\Sigma_2$, and to add a δ-reduction rule
that allows us to unfold the definition of $b$:

\[ \Delta \vdash b \to_\delta \lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1 \]

whenever $b := (\lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1) : (\P_{i=1}^{n} x_i{:}\alpha_i.\Sigma_2) \in \Delta$.

Unfolding the definition of $b$ in a term $b\Omega_1\cdots\Omega_n$ and applying β-reduction
n times results in $\Sigma_1[x_1{:=}\Omega_1]\cdots[x_n{:=}\Omega_n]$. This procedure
corresponds exactly to the δ-reduction

\[ b(\Omega_1,\ldots,\Omega_n) \to_\delta \Sigma_1[x_1,\ldots,x_n{:=}\Omega_1,\ldots,\Omega_n] \]

in AUT-68.⁵
This method, however, has some disadvantages.


• Look again at a line $(x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n; b; \Sigma_1; \Sigma_2)$ in an AUT-68 book.
Then $b(\Omega_1,\ldots,\Omega_n)$ has $b\Omega_1\cdots\Omega_n$ as its equivalent in λ68. If $n > 0$,
the latter λ68-term has $B \equiv b\Omega_1\cdots\Omega_m$ as a subterm for any $m < n$.
But $B$ has no equivalent in AUT-68: only after $B$ has been applied
to suitable terms $\Omega_{m+1},\ldots,\Omega_n$ does the resulting term $B\Omega_{m+1}\cdots\Omega_n$ have
$b(\Omega_1,\ldots,\Omega_n)$ as its equivalent in AUT-68. Hence $B$ must not be
seen as a term directly translatable into AUTOMATH, but only as an
intermediate result that is necessary to construct the equivalent of
the expression $b(\Omega_1,\ldots,\Omega_n)$. $B$ is recognisable as an intermediate
result via its type $\P_{i=m+1}^{n} x_i{:}\alpha_i.\Sigma_2$, which has sort $\triangle$ (instead of $*$ or
$\Box$).

The method above allows us to unfold the definition of $b$ already in $B$,
because $b\Omega_1\cdots\Omega_m$ can reduce to $(\lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1)\Omega_1\cdots\Omega_m$, and we
can β-reduce this term m times to $(\lambda_{i=m+1}^{n} x_i{:}\alpha_i.\Sigma_1)[x_i{:=}\Omega_i]_{i=1}^{m}$. It
is more in line with AUT-68 to make such unfolding impossible before
all n arguments $\Omega_1,\ldots,\Omega_n$ have been applied to $b$, so only when the
construction of the equivalent of $b(\Omega_1,\ldots,\Omega_n)$ has been completed;


• Moreover, $\lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1$ does not necessarily have an equivalent in AUT-68.
Consider for instance the constant $b$ in the line

($\alpha$:type; $b$; $[x{:}\alpha]x$; $[x{:}\alpha]\alpha$).

In this case, $\lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1 \equiv \lambda\alpha{:}*.\lambda x{:}\alpha.x$. Its equivalent in AUT-68
would be $[\alpha{:}\mathrm{type}][x{:}\alpha]x$, but an abstraction $[\alpha{:}\mathrm{type}]$ cannot be
made in AUT-68.⁶ This is the reason why we do not incorporate
$\lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1$ as a citizen of λ68; we feel that this is better than making
it a (first-class or second-class) citizen of λ68.

⁵ We can assume that the $x_i$ do not occur in the $\Omega_j$, so the simultaneous substitution
$\Sigma_1[x_1,\ldots,x_n{:=}\Omega_1,\ldots,\Omega_n]$ is equal to $\Sigma_1[x_1{:=}\Omega_1]\cdots[x_n{:=}\Omega_n]$.


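Therefore we choose a different translation. The line

\[ (x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n; b; \Sigma_1; \Sigma_2), \]

where $\Sigma_1 \in \mathcal{E}$, will be translated by putting

\[ b := (\S_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1) : (\P_{i=1}^{n} x_i{:}\alpha_i.\Sigma_2) \]

instead of

\[ b := (\lambda_{i=1}^{n} x_i{:}\alpha_i.\Sigma_1) : (\P_{i=1}^{n} x_i{:}\alpha_i.\Sigma_2) \]

in the left part of the translated context $\Delta$. And a reduction rule

\[ bX_1\cdots X_n \to_\delta \Sigma_1[x_1,\ldots,x_n{:=}X_1,\ldots,X_n] \]

is added for all terms $X_1,\ldots,X_n$. The symbol § is used instead of λ. This
is to emphasise that, though both $\S x{:}A$ and $\lambda x{:}A$ are abstractions, they are
not the same kind of abstraction.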
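
The effect of this choice, that a definition can only be unfolded once all its parameters have been supplied, can be illustrated with a small Python sketch. It is an illustration under simplifying assumptions and not the thesis's formal definition: terms are nested tuples, definition bodies contain no binders (so substitution cannot capture variables), and the names `spine`, `substitute` and `delta_step` are hypothetical.

```python
def spine(term):
    """Split b X1 ... Xk into its head and the list [X1, ..., Xk]."""
    args = []
    while term[0] == "app":
        args.append(term[2])
        term = term[1]
    return term, args[::-1]


def substitute(term, env):
    """Simultaneous substitution for variables (the sketch has no binders)."""
    tag = term[0]
    if tag == "var":
        return env.get(term[1], term)
    if tag == "app":
        return ("app", substitute(term[1], env), substitute(term[2], env))
    return term  # constants are left unchanged


def delta_step(term, defs):
    """Unfold the head constant, but only if all its parameters are supplied.

    defs maps a defined constant to (parameter names, body).
    Returns the unfolded term, or None if the term is not a delta-redex."""
    head, args = spine(term)
    if head[0] != "const" or head[1] not in defs:
        return None
    params, body = defs[head[1]]
    if len(args) < len(params):
        return None  # partial application: an intermediate result, not a redex
    result = substitute(body, dict(zip(params, args)))
    for extra in args[len(params):]:
        result = ("app", result, extra)
    return result
```

A partially applied constant such as `b p`, for a `b` with two parameters, is therefore left alone, which mirrors the intermediate results of sort $\triangle$ discussed above.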


5c λ68


5c1 Definition and elementary properties


We give the formal definition of λ68, based on the motivation in Section
5b.

⁶ This situation can be compared to the situation in Section 5b1, where we found that
the type of $b$ is not necessarily a first-class citizen of AUT-68. There, we could not avoid
that the type of $b$ became a citizen of λ68 (though we made it a second-class citizen by
storing it in the sort $\triangle$).


Definition 5.21 (λ68)

1. The terms of λ68 form a set $\mathcal{T}$ defined by
\[ \mathcal{T} ::= \mathcal{V} \mid \mathcal{C} \mid \mathcal{S} \mid \mathcal{T}\mathcal{T} \mid \lambda\mathcal{V}{:}\mathcal{T}.\mathcal{T} \mid \Pi\mathcal{V}{:}\mathcal{T}.\mathcal{T} \mid \S\mathcal{V}{:}\mathcal{T}.\mathcal{T} \mid \P\mathcal{V}{:}\mathcal{T}.\mathcal{T}, \]
where $\mathcal{S}$ is the set of sorts $\{*, \Box, \triangle\}$.

We also define the sets of free variables FV(T) and ("free")⁷ constants
FC(T) of a term T in the straightforward way;

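2. We define the notion of context inductively:

• $\varnothing;\varnothing$ is a context; $\mathrm{DOM}(\varnothing;\varnothing) = \varnothing$;

• If $\Delta;\Gamma$ is a context, $x \in \mathcal{V}$, $x$ does not occur in $\Delta;\Gamma$ and $A \in \mathcal{T}$,
then $\Delta;\Gamma,x{:}A$ is a context ($x$ is a newly introduced variable);
$\mathrm{DOM}(\Delta;\Gamma,x{:}A) = \mathrm{DOM}(\Delta;\Gamma) \cup \{x\}$;

• If $\Delta;\Gamma$ is a context, $b \in \mathcal{C}$, $b$ does not occur in $\Delta;\Gamma$ and $A \in \mathcal{T}$
then $\Delta,b{:}A;\Gamma$ is a context (in this case $b$ is a primitive constant;
cf. the primitive notions of AUTOMATH in Section 5a1);
$\mathrm{DOM}(\Delta,b{:}A;\Gamma) = \mathrm{DOM}(\Delta;\Gamma) \cup \{b\}$;

• If $\Delta;\Gamma$ is a context, $b \in \mathcal{C}$, $b$ does not occur in $\Delta;\Gamma$, $A \in \mathcal{T}$,
and $T \in \mathcal{T}$, then $\Delta,b{:=}T{:}A;\Gamma$ is a context (in this case $b$ is
a defined constant; cf. the definitions of AUTOMATH in Section
5a1); $\mathrm{DOM}(\Delta,b{:=}T{:}A;\Gamma) = \mathrm{DOM}(\Delta;\Gamma) \cup \{b\}$.


Observe that a semicolon is used as the separation mark between the
two parts of the context, and that a comma is used to separate the
different expressions within each of these parts.


We define
$\mathrm{PRIMCONS}(\Delta;\Gamma) = \{b \in \mathrm{DOM}(\Delta;\Gamma) \mid b$ is a primitive constant$\}$;
$\mathrm{DEFCONS}(\Delta;\Gamma) = \{b \in \mathrm{DOM}(\Delta;\Gamma) \mid b$ is a defined constant$\}$;
$\mathrm{FV}(\Delta;\Gamma) = \mathrm{DOM}(\Gamma)$;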

3. We define the notion of δ-reduction on terms. Let $\Delta$ be the left part
of a context. If $(b := (\S_{i=1}^{n} x_i{:}A_i.T) : (\P_{i=1}^{n} x_i{:}A_i.B)) \in \Delta$, where $B$ is
not of the form $\P y{:}B_1.B_2$, then

\[ \Delta \vdash bX_1\cdots X_n \to_\delta T[x_1,\ldots,x_n{:=}X_1,\ldots,X_n] \]

for all $X_1,\ldots,X_n \in \mathcal{T}$.

We also have the usual compatibility rules on δ-reduction. We use
notations like $\twoheadrightarrow_\delta$ and $=_\delta$ as usual. When there is no confusion
about which $\Delta$ is considered, we simply write

\[ bX_1\cdots X_n \to_\delta T[x_1,\ldots,x_n{:=}X_1,\ldots,X_n]; \]

4. We use the usual notion of β-reduction;

5. Judgements in λ68 have the form $\Delta;\Gamma \vdash A : B$, where $\Delta;\Gamma$ is a
context and $A$ and $B$ are terms. In the case that a judgement $\Delta;\Gamma \vdash
A : B$ is derivable according to the rules below, $\Delta;\Gamma$ is a legal context
and $A$ and $B$ are legal terms. We write $\Delta;\Gamma \vdash A : B : C$ if both
$\Delta;\Gamma \vdash A : B$ and $\Delta;\Gamma \vdash B : C$ are derivable in λ68.

⁷ Of course, to call a constant "free" is a bit peculiar, since there are no bound
constants.


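Here are the rules for λ68 (v, pc, and dc are shorthand for variable,
primitive constant, and defined constant, respectively):

(Axiom)
\[ \varnothing;\varnothing \vdash * : \Box \]

(Start: v)
\[ \frac{\Delta;\Gamma \vdash A : s}{\Delta;\Gamma,x{:}A \vdash x : A} \]
where $s \equiv *, \Box$

(Start: pc)
\[ \frac{\Delta;\Gamma \vdash B : s_1 \qquad \Delta; \vdash \P\Gamma.B : s_2}{\Delta,\, b{:}\P\Gamma.B; \vdash b : \P\Gamma.B} \]
where $s_1 \equiv *, \Box$

(Start: dc)
\[ \frac{\Delta;\Gamma \vdash T : B : s_1 \qquad \Delta; \vdash \P\Gamma.B : s_2}{\Delta,\, b{:=}(\S\Gamma.T){:}(\P\Gamma.B); \vdash b : \P\Gamma.B} \]
where $s_1 \equiv *, \Box$

(Weak: v)
\[ \frac{\Delta;\Gamma \vdash M : N \qquad \Delta;\Gamma \vdash A : s}{\Delta;\Gamma,x{:}A \vdash M : N} \]
where $s \equiv *, \Box$

(Weak: pc)
\[ \frac{\Delta; \vdash M : N \qquad \Delta;\Gamma \vdash B : s_1 \qquad \Delta; \vdash \P\Gamma.B : s_2}{\Delta,\, b{:}\P\Gamma.B; \vdash M : N} \]
where $s_1 \equiv *, \Box$

(Weak: dc)
\[ \frac{\Delta; \vdash M : N \qquad \Delta;\Gamma \vdash T : B : s_1 \qquad \Delta; \vdash \P\Gamma.B : s_2}{\Delta,\, b{:=}(\S\Gamma.T){:}(\P\Gamma.B); \vdash M : N} \]
where $s_1 \equiv *, \Box$

(Π-form)
\[ \frac{\Delta;\Gamma \vdash A : * \qquad \Delta;\Gamma,x{:}A \vdash B : *}{\Delta;\Gamma \vdash (\Pi x{:}A.B) : *} \]

(¶-form)
\[ \frac{\Delta;\Gamma \vdash A : s_1 \qquad \Delta;\Gamma,x{:}A \vdash B : s_2}{\Delta;\Gamma \vdash (\P x{:}A.B) : \triangle} \]
where $s_1 \equiv *, \Box$

(λ)
\[ \frac{\Delta;\Gamma \vdash (\Pi x{:}A.B) : * \qquad \Delta;\Gamma,x{:}A \vdash F : B}{\Delta;\Gamma \vdash (\lambda x{:}A.F) : (\Pi x{:}A.B)} \]

(App₁)
\[ \frac{\Delta;\Gamma \vdash M : \Pi x{:}A.B \qquad \Delta;\Gamma \vdash N : A}{\Delta;\Gamma \vdash MN : B[x{:=}N]} \]

(App₂)
\[ \frac{\Delta;\Gamma \vdash M : \P x{:}A.B \qquad \Delta;\Gamma \vdash N : A}{\Delta;\Gamma \vdash MN : B[x{:=}N]} \]

(Conv)
\[ \frac{\Delta;\Gamma \vdash M : A \qquad \Delta;\Gamma \vdash B : s \qquad \Delta \vdash A =_{\beta\delta} B}{\Delta;\Gamma \vdash M : B} \]


The newly introduced variables in the Start-rules and Weakening-rules
are assumed to be fresh. Moreover, when introducing a variable
$x$ with a "pc"-rule or a "dc"-rule, we assume $x \in \mathcal{C}$, and when
introducing $x$ via a "v"-rule, we assume $x \in \mathcal{V}$.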


We write $\Delta;\Gamma \vdash_{\lambda 68} A : B$ instead of $\Delta;\Gamma \vdash A : B$ if the latter gives rise to
confusion with other derivation systems.

Notice that there is no rule (§). This is because we do not want
terms of the form $\S x{:}A.B$ to be first-class citizens of λ68: they do not have
an equivalent in AUTOMATH.

Many basic properties of Pure Type Systems also hold for λ68 and
can be proved by the same methods as in the standard literature on PTSs.
Due to the split of contexts and the different treatment of constants and
variables, these properties are at some points formulated differently than
usual (see Section Ad of the Appendix).
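
The two-part context of Definition 5.21 can be pictured as a simple data structure. The following Python sketch is only an illustration: the class and method names are hypothetical, terms are kept as opaque strings, and freshness is checked against the declared names only (a simplification of "does not occur in the context").

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ConstDecl:
    name: str
    type_: str
    body: Optional[str] = None  # None: primitive constant; otherwise defined


@dataclass
class Context:
    left: list = field(default_factory=list)   # constant declarations (Delta)
    right: list = field(default_factory=list)  # (variable, type) pairs (Gamma)

    def dom(self):
        return {d.name for d in self.left} | {x for x, _ in self.right}

    def _fresh(self, name):
        if name in self.dom():
            raise ValueError(f"{name} already occurs in the context")

    def add_variable(self, x, type_):
        self._fresh(x)
        self.right.append((x, type_))

    def add_primitive(self, b, type_):
        self._fresh(b)
        self.left.append(ConstDecl(b, type_))

    def add_definition(self, b, body, type_):
        self._fresh(b)
        self.left.append(ConstDecl(b, type_, body))

    def primcons(self):
        return {d.name for d in self.left if d.body is None}

    def defcons(self):
        return {d.name for d in self.left if d.body is not None}

    def fv(self):
        return {x for x, _ in self.right}
```

Keeping constants and variables in separate lists reflects the semicolon in $\Delta;\Gamma$: the start and weakening rules for constants only ever extend the left list.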


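Lemma 5.22 (Free Variable Lemma) Assume $\Delta;\Gamma \vdash M : N$. Write
$\Delta \equiv b_1{:}B_1,\ldots,b_m{:}B_m$; $\Gamma \equiv x_1{:}A_1,\ldots,x_n{:}A_n$ (in $\Delta$, also expressions
$b_i{:=}T_i{:}B_i$ may occur, but for uniformity of notation we leave out the $:=T_i$-part).
Then:


• The $b_1,\ldots,b_m \in \mathcal{C}$ and $x_1,\ldots,x_n \in \mathcal{V}$ are all distinct;

• $\mathrm{FC}(M), \mathrm{FC}(N) \subseteq \{b_1,\ldots,b_m\}$; $\mathrm{FV}(M), \mathrm{FV}(N) \subseteq \{x_1,\ldots,x_n\}$;

• $b_1{:}B_1,\ldots,b_{i-1}{:}B_{i-1}; \vdash B_i : s_i$ for some $s_i \in \{*, \Box, \triangle\}$;
$\Delta; x_1{:}A_1,\ldots,x_{j-1}{:}A_{j-1} \vdash A_j : t_j$ for some $t_j \in \{*, \Box\}$.
⊠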


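Lemma 5.23 (Start Lemma) Let $\Delta;\Gamma$ be a legal context. Then $\Delta;\Gamma \vdash
* : \Box$, and if $c{:}A \in \Delta;\Gamma$, or $c{:=}T{:}A \in \Delta$, then $\Delta;\Gamma \vdash c : A$. ⊠


The following lemma is not a basic PTS-property. However, it can be seen
as an extension of the Start Lemma.


Lemma 5.24 (Definition Lemma) Assume
\[ \Delta_1,\, b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.B),\, \Delta_2;\Gamma \vdash M : N, \]
where $B$ is not of the form $\P y{:}B_1.B_2$. Then $\Delta_1; x_1{:}A_1,\ldots,x_n{:}A_n \vdash T : B :
s$ for an $s \in \{*, \Box\}$. ⊠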
The Transitivity Lemma must be formulated in a somewhat different
way than usual (cf. A.24). This has to do with the fact that contexts may
contain definitions. To the usual formulation

"Let $\Delta_1;\Gamma_1$ and $\Delta_2;\Gamma_2$ be contexts, of which $\Delta_1;\Gamma_1$ is legal.
Assume that for all $b{:}A \in \Delta_2;\Gamma_2$ and for all $b{:=}T{:}A \in \Delta_2$,
$\Delta_1;\Gamma_1 \vdash b : A$. Then $\Delta_2;\Gamma_2 \vdash B : C \Rightarrow \Delta_1;\Gamma_1 \vdash B : C$."

we must add an extra clause that $b$ is defined in $\Delta_1;\Gamma_1$, in a similar way as
it has been defined in $\Delta_2;\Gamma_2$. In the following example we show that things
go wrong otherwise:


Example 5.25 Let
\[ \Delta_1 \equiv b_1{:}*,\ b_2{:}*,\ b_3{:=}b_1{:}*; \]
\[ \Delta_2 \equiv b_1{:}*,\ b_2{:}*,\ b_3{:=}b_2{:}* \]

and $\Gamma_1 \equiv \Gamma_2 \equiv x_3{:}b_3$. Notice that all the assumptions of the traditional
formulation of the Transitivity Lemma (see above) hold for $\Delta_1;\Gamma_1$ and
$\Delta_2;\Gamma_2$. Nevertheless, we can derive

\[ \Delta_2;\Gamma_2 \vdash x_3 : b_2 \]

(because $\Delta_2;\Gamma_2 \vdash x_3 : b_3$ and according to $\Delta_2$, $b_3 =_{\beta\delta} b_2$, so we can use the
conversion rule). But we cannot derive

\[ \Delta_1;\Gamma_1 \vdash x_3 : b_2 \]

(because $b_3$ and $b_2$ are not definitionally equal according to $\Delta_1$).
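
The difference between the two contexts of Example 5.25 can be made concrete with a few lines of Python. This is a minimal sketch restricted to parameterless definitions of one constant by another; the names `delta_normal_form`, `definitionally_equal`, `DELTA1` and `DELTA2` are hypothetical, and only the defined constants of each context are recorded.

```python
def delta_normal_form(name, defs):
    """Follow parameterless definitions b := c until an undefined name is reached."""
    seen = set()
    while name in defs and name not in seen:
        seen.add(name)
        name = defs[name]
    return name


def definitionally_equal(a, b, defs):
    """Compare two constants by their delta-normal forms."""
    return delta_normal_form(a, defs) == delta_normal_form(b, defs)


DELTA1 = {"b3": "b1"}  # b1:*, b2:*, b3:=b1:*
DELTA2 = {"b3": "b2"}  # b1:*, b2:*, b3:=b2:*
```

Under `DELTA2` the constants `b3` and `b2` are identified, under `DELTA1` they are not, which is exactly why the conversion step of the example is available in one context only.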
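The following formulation of the Transitivity Lemma is correct:

Definition 5.26 We define: $\Delta_1;\Gamma_1 \vdash \Delta_2;\Gamma_2$ if and only if

• If $b{:}A \in \Delta_2;\Gamma_2$ then $\Delta_1;\Gamma_1 \vdash b : A$;

• If $b{:=}T{:}A \in \Delta_2$ then $\Delta_1;\Gamma_1 \vdash b : A$;

• If $b{:=}(\S_{i=1}^{n} x_i{:}A_i.U){:}B \in \Delta_2$ and $U \not\equiv \S y{:}B'.A'$ then
$\Delta_1 \vdash bx_1\cdots x_n =_{\beta\delta} U$.


Lemma 5.27 (Transitivity Lemma)
Assume $\Delta_1;\Gamma_1 \vdash \Delta_2;\Gamma_2$ and $\Delta_2;\Gamma_2 \vdash B : C$. Then $\Delta_1;\Gamma_1 \vdash B : C$. ⊠


Lemma 5.28 (Substitution Lemma)
Assume $\Delta;\Gamma_1,x{:}A,\Gamma_2 \vdash B : C$ and $\Delta;\Gamma_1 \vdash D : A$.
Then $\Delta;\Gamma_1,\Gamma_2[x{:=}D] \vdash B[x{:=}D] : C[x{:=}D]$. ⊠


Lemma 5.29 (Thinning Lemma) Let $\Delta_1;\Gamma_1$ be a legal context, and let
$\Delta_2;\Gamma_2$ be a legal context such that $\Delta_1 \subseteq \Delta_2$ and $\Gamma_1 \subseteq \Gamma_2$.
Then $\Delta_1;\Gamma_1 \vdash A : B \Rightarrow \Delta_2;\Gamma_2 \vdash A : B$. ⊠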


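Lemma 5.30 (Generation Lemma)


• If $x \in \mathcal{V}$ and $\Delta;\Gamma \vdash x : C$ then there is $s \in \{*, \Box\}$ and $B =_{\beta\delta} C$ such
that $\Delta;\Gamma \vdash B : s$ and $x{:}B \in \Gamma$;

• If $b \in \mathcal{C}$ and $\Delta;\Gamma \vdash b : C$ then there is $s \in \mathcal{S}$ and $B =_{\beta\delta} C$ such that
$\Delta;\Gamma \vdash B : s$, and either $b{:}B \in \Delta$ or there is $T$ such that $b{:=}T{:}B \in \Delta$;

• If $s \in \mathcal{S}$ and $\Delta;\Gamma \vdash s : C$ then $s \equiv *$ and $C =_{\beta\delta} \Box$;

• If $\Delta;\Gamma \vdash MN : C$ then there are $A, B$ such that $\Delta;\Gamma \vdash M : (\Pi x{:}A.B)$
or $\Delta;\Gamma \vdash M : (\P x{:}A.B)$, and $\Delta;\Gamma \vdash N : A$ and $C =_{\beta\delta} B[x{:=}N]$;

• If $\Delta;\Gamma \vdash (\lambda x{:}A.b) : C$ then there is $B$ such that $\Delta;\Gamma \vdash (\Pi x{:}A.B) : *$,
$\Delta;\Gamma,x{:}A \vdash b : B$ and $C =_{\beta\delta} \Pi x{:}A.B$;

• Assume $\Delta;\Gamma \vdash (\Pi x{:}A.B) : C$.
Then $C =_{\beta\delta} *$, $\Delta;\Gamma \vdash A : *$ and $\Delta;\Gamma,x{:}A \vdash B : *$;

• If $\Delta;\Gamma \vdash (\P x{:}A.B) : C$ then $C =_{\beta\delta} \triangle$, $\Delta;\Gamma \vdash A : s_1$ for some $s_1 \in
\{*, \Box\}$, and $\Delta;\Gamma,x{:}A \vdash B : s_2$ for some $s_2 \in \{*, \Box, \triangle\}$.
⊠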


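Lemma 5.31 (Unicity of Types) If $\Delta;\Gamma \vdash A : B_1$ and $\Delta;\Gamma \vdash A : B_2$
then $B_1 =_{\beta\delta} B_2$. ⊠


Lemma 5.32 (Correctness of Types) If $\Delta;\Gamma \vdash A : B$ then there is
$s \in \mathcal{S}$ such that $B \equiv s$ or $\Delta;\Gamma \vdash B : s$. ⊠


From Correctness of Types and the Generation Lemma we conclude:

Lemma 5.33 If $\Delta;\Gamma \vdash A : (\Pi x{:}B_1.B_2)$ then

• $\Delta;\Gamma \vdash B_1 : *$;

• $\Delta;\Gamma,x{:}B_1 \vdash B_2 : *$.
⊠


Lemma 5.34 If $\Delta;\Gamma \vdash A : (\P x{:}B_1.B_2)$ then

• $\Delta;\Gamma \vdash B_1 : s_1$ for some $s_1 \in \{*, \Box\}$;

• $\Delta;\Gamma,x{:}B_1 \vdash B_2 : s_2$ for some sort $s_2$.
⊠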


5c2 Reduction and conversion 


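In this section we show some properties of the reduction relations $\to_\beta$, $\to_\delta$
and $\to_{\beta\delta}$. As δ-reduction also depends on books, we first have to give a
translation of AUT-68 books and AUT-contexts to λ68-contexts:

Definition 5.35 Let $\Gamma$ be an AUT-68-context $x_1{:}\alpha_1,\ldots,x_n{:}\alpha_n$. Then
$\overline{\Gamma} \equiv x_1{:}\overline{\alpha_1},\ldots,x_n{:}\overline{\alpha_n}$.

Definition 5.36 Let $\mathfrak{B}$ be a book. We define the left part $\overline{\mathfrak{B}}$ of a context
in λ68:

• $\overline{\varnothing} \equiv \varnothing$;

• $\overline{\mathfrak{B}, (\Gamma; b; \mathrm{PN}; \Omega)} \equiv \overline{\mathfrak{B}},\, b{:}\P\overline{\Gamma}.\overline{\Omega}$;

• $\overline{\mathfrak{B}, (\Gamma; x; -; \Omega)} \equiv \overline{\mathfrak{B}}$;

• $\overline{\mathfrak{B}, (\Gamma; b; \Sigma; \Omega)} \equiv \overline{\mathfrak{B}},\, b{:=}\S\overline{\Gamma}.\overline{\Sigma} : \P\overline{\Gamma}.\overline{\Omega}$.


Example 5.37 The translation of the book of Example 5.9 is given in Figure 9
(because of the habit in computer science to use more than one digit
for a variable, we have to write some additional brackets around subterms
like proof to keep things unambiguous). We see that all variable declarations
of the original book have disappeared in the translation. In the
original book, they do not add any new knowledge but are only used to
construct contexts. In our translation, this happens in the right part of the
context, instead of the left part.


Lemma 5.38 Assume $\Sigma$ is a correct expression with respect to a book $\mathfrak{B}$.
1. $\Sigma \to_\beta \Sigma'$ if and only if $\overline{\Sigma} \to_\beta \overline{\Sigma'}$;
2. $\mathfrak{B} \vdash \Sigma \to_\delta \Sigma'$ if and only if $\overline{\mathfrak{B}} \vdash \overline{\Sigma} \to_\delta \overline{\Sigma'}$.

PROOF: An easy induction on the structure of $\Sigma$. ⊠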


The Church-Rosser property of $\to_{\beta\delta}$ will be proved by the method of
Parallel Reduction, invented by Martin-Löf and Tait (see Section 3.2 of
[4]).


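Definition 5.39 Let $\Delta$ be the left part of a context. We define a reduction
relation $\Rightarrow_{\beta\delta}$ ("parallel reduction") on the set of terms $\mathcal{T}$:

• For $x \in \mathcal{V}$, $\Delta \vdash x \Rightarrow_{\beta\delta} x$;

• For $b \in \mathcal{C}$, $\Delta \vdash b \Rightarrow_{\beta\delta} b$;

• For $s \in \mathcal{S}$, $\Delta \vdash s \Rightarrow_{\beta\delta} s$;

• If $\Delta \vdash P \Rightarrow_{\beta\delta} P'$ and $\Delta \vdash Q \Rightarrow_{\beta\delta} Q'$, then
  - $\Delta \vdash \lambda x{:}P.Q \Rightarrow_{\beta\delta} \lambda x{:}P'.Q'$;
  - $\Delta \vdash \Pi x{:}P.Q \Rightarrow_{\beta\delta} \Pi x{:}P'.Q'$;
  - $\Delta \vdash \P x{:}P.Q \Rightarrow_{\beta\delta} \P x{:}P'.Q'$;
  - $\Delta \vdash PQ \Rightarrow_{\beta\delta} P'Q'$;

• If $\Delta \vdash Q \Rightarrow_{\beta\delta} Q'$ and $\Delta \vdash R \Rightarrow_{\beta\delta} R'$, then
$\Delta \vdash (\lambda x{:}P.Q)R \Rightarrow_{\beta\delta} Q'[x{:=}R']$;

• If $b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.U) \in \Delta$, the term $T$ is not of the form
$\S y{:}T_1.T_2$, $\Delta \vdash T \Rightarrow_{\beta\delta} T'$ and $\Delta \vdash M_i \Rightarrow_{\beta\delta} M_i'$ for $i = 1,\ldots,n$, then
$\Delta \vdash bM_1\cdots M_n \Rightarrow_{\beta\delta} T'[x_1,\ldots,x_n{:=}M_1',\ldots,M_n']$.


Figure 9: Translation of Example 5.9

prop   : *,
and    : ¶x:prop.¶y:prop.prop,
proof  : ¶x:prop.*,
and-I  : ¶x:prop.¶y:prop.¶px:(proof)x.¶py:(proof)y.(proof)((and)xy),
and-O1 : ¶x:prop.¶y:prop.¶pxy:(proof)((and)xy).(proof)x,
and-O2 : ¶x:prop.¶y:prop.¶pxy:(proof)((and)xy).(proof)y,
and-R  := §x:prop.§prx:(proof)x.(and-I)xx(prx)(prx)
        : ¶x:prop.¶prx:(proof)x.(proof)((and)xx),
and-S  := §x:prop.§y:prop.§pxy:(proof)((and)xy).
          (and-I)yx((and-O2)xy(pxy))((and-O1)xy(pxy))
        : ¶x:prop.¶y:prop.¶pxy:(proof)((and)xy).(proof)((and)yx)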


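Some elementary properties of $\Rightarrow_{\beta\delta}$ are:


Lemma 5.40 (Properties of $\Rightarrow_{\beta\delta}$) Let $\Delta$ be the left part of a context.
For all terms $M$, $N$:

1. $\Delta \vdash M \Rightarrow_{\beta\delta} M$;
2. If $\Delta \vdash M \to_{\beta\delta} M'$ then $\Delta \vdash M \Rightarrow_{\beta\delta} M'$;
3. If $\Delta \vdash M \Rightarrow_{\beta\delta} M'$ then $\Delta \vdash M \twoheadrightarrow_{\beta\delta} M'$.

PROOF: All proofs can be given by induction on the structure of $M$. ⊠


We conclude from this lemma that $\twoheadrightarrow_{\beta\delta}$ (the reflexive and transitive closure
of $\to_{\beta\delta}$) in the context $\Delta$ is the same relation as the reflexive and transitive
closure of $\Rightarrow_{\beta\delta}$ in $\Delta$. Therefore, if we want to prove the Church-Rosser
theorem for $\twoheadrightarrow_{\beta\delta}$, it suffices to prove the Diamond Property for $\Rightarrow_{\beta\delta}$. We
first make some preliminary definitions and remarks: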


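Lemma 5.41 (Substitution and $\Rightarrow_{\beta\delta}$) If $\Delta \vdash M \Rightarrow_{\beta\delta} M'$ and $\Delta \vdash
N \Rightarrow_{\beta\delta} N'$ then $\Delta \vdash M[y{:=}N] \Rightarrow_{\beta\delta} M'[y{:=}N']$.


PROOF: Induction on the structure of $M$. ⊠

Lemma 5.42 Assume $\Delta$ and $\Delta,\Delta'$ are left parts of legal contexts, and
$\mathrm{FC}(M) \subseteq \mathrm{DOM}(\Delta)$. Then $\Delta \vdash M \Rightarrow_{\beta\delta} N$ if and only if $\Delta,\Delta' \vdash M \Rightarrow_{\beta\delta} N$.


PROOF: By induction on the length of $\Delta$ and by induction on the definition
of $\Delta \vdash M \Rightarrow_{\beta\delta} N$. All cases in the definition of $\Delta \vdash M \Rightarrow_{\beta\delta} N$ follow
directly from the induction hypothesis for $\Delta \vdash M \Rightarrow_{\beta\delta} N$, except for the
case $bM_1\cdots M_n \Rightarrow_{\beta\delta} T'[x_1,\ldots,x_n{:=}M_1',\ldots,M_n']$.

As $\mathrm{FC}(M) \subseteq \mathrm{DOM}(\Delta)$, we have $b \in \mathrm{DOM}(\Delta)$.
Write $\Delta \equiv \Delta_1,\, b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.U),\, \Delta_2$.

• Notice that $T$ is typable in $\Delta_1; x_1{:}A_1,\ldots,x_n{:}A_n$ (Definition Lemma).
By the Free Variable Lemma: $\mathrm{FC}(T) \subseteq \mathrm{DOM}(\Delta_1)$. By the induction
hypothesis on the length of $\Delta$ we have $\Delta_1 \vdash T \Rightarrow_{\beta\delta} T'$ iff $\Delta \vdash T \Rightarrow_{\beta\delta} T'$,
and $\Delta_1 \vdash T \Rightarrow_{\beta\delta} T'$ iff $\Delta,\Delta' \vdash T \Rightarrow_{\beta\delta} T'$;

• We conclude: $\Delta \vdash T \Rightarrow_{\beta\delta} T'$ iff $\Delta,\Delta' \vdash T \Rightarrow_{\beta\delta} T'$;

• By the induction hypothesis on the definition of $\Delta \vdash M \Rightarrow_{\beta\delta} N$, we
have $\Delta \vdash M_i \Rightarrow_{\beta\delta} M_i'$ iff $\Delta,\Delta' \vdash M_i \Rightarrow_{\beta\delta} M_i'$;

Notice that $b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.U)$ is an element of both $\Delta$
and $\Delta,\Delta'$. Moreover, $b \notin \mathrm{DOM}(\Delta')$ (because $\Delta,\Delta'$ is the left part of a
legal context). Therefore we have that $\Delta \vdash bM_1\cdots M_n \Rightarrow_{\beta\delta} N$ if and
only if $\Delta,\Delta' \vdash bM_1\cdots M_n \Rightarrow_{\beta\delta} N$.
⊠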


For left parts $\Delta$ of contexts and for $M \in \mathcal{T}$ with $\mathrm{FC}(M) \subseteq \mathrm{DOM}(\Delta)$,
we define a term $M^\Delta$. In $M^\Delta$, all β-redexes that exist in $M$ are contracted
simultaneously (this is a usual step in a proof of Church-Rosser by Parallel
Reduction), but also all δ-redexes are contracted. We will show that $\Delta \vdash
N \Rightarrow_{\beta\delta} M^\Delta$ for any $N$ with $\Delta \vdash M \Rightarrow_{\beta\delta} N$; so $M^\Delta$ helps us to show the
Diamond Property for $\Rightarrow_{\beta\delta}$.


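Definition 5.43 We define, for any left part $\Delta$ of a context and any $M \in \mathcal{T}$
such that $\mathrm{FC}(M) \subseteq \mathrm{DOM}(\Delta)$, $M^\Delta$. The definition of $M^\Delta$ is by induction
on the length of $\Delta$. So assume $M^{\Delta'}$ has been defined for contexts $\Delta'$ that
are shorter than $\Delta$. We use induction on the structure of $M$:

• $x^\Delta \stackrel{\mathrm{def}}{=} x$ for any $x \in \mathcal{V}$;

• $M \equiv b$. Distinguish:
  - $b^\Delta \stackrel{\mathrm{def}}{=} b$ for any $b \in \mathrm{PRIMCONS}(\Delta;)$;
  - $b^\Delta \stackrel{\mathrm{def}}{=} b$ for any $b \in \mathrm{DEFCONS}(\Delta;)$ that is not a δ-redex;
  - If $b \in \mathrm{DEFCONS}(\Delta;)$ is a δ-redex, then $\Delta \equiv \Delta_1,\, b{:=}T{:}U,\, \Delta_2$,
    where $T \not\equiv \S y{:}T_1.T_2$. By the Definition Lemma, $\Delta_1; \vdash T : U$,
    so we can assume that $T^{\Delta_1}$ has already been defined. Then
    $b^\Delta \stackrel{\mathrm{def}}{=} T^{\Delta_1}$;

• $s^\Delta \stackrel{\mathrm{def}}{=} s$ for any $s \in \mathcal{S}$;

• $(\lambda x{:}P.Q)^\Delta \stackrel{\mathrm{def}}{=} \lambda x{:}P^\Delta.Q^\Delta$;
  $(\Pi x{:}P.Q)^\Delta \stackrel{\mathrm{def}}{=} \Pi x{:}P^\Delta.Q^\Delta$;
  $(\P x{:}P.Q)^\Delta \stackrel{\mathrm{def}}{=} \P x{:}P^\Delta.Q^\Delta$;

• $M$ is an application term. We distinguish three possibilities:
  - $M \equiv PQ$ is not a βδ-redex. Then we define $M^\Delta \stackrel{\mathrm{def}}{=} P^\Delta Q^\Delta$;
  - $M$ is a β-redex $(\lambda x{:}P.Q)R$. We define $M^\Delta \stackrel{\mathrm{def}}{=} Q^\Delta[x{:=}R^\Delta]$;
  - $M$ is a δ-redex $bM_1\cdots M_n$, and
    \[ \Delta \equiv \Delta_1,\, b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.U),\, \Delta_2, \]
    where $T$ is not of the form $\S y{:}T_1.T_2$. In that case
    \[ \Delta_1; x_1{:}A_1,\ldots,x_n{:}A_n \vdash T : U \]
    (by the Definition Lemma), so we can assume that $T^{\Delta_1}$ has
    already been defined.
    Then $M^\Delta \stackrel{\mathrm{def}}{=} T^{\Delta_1}[x_1,\ldots,x_n{:=}M_1^\Delta,\ldots,M_n^\Delta]$.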


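Lemma 5.44 Let $\Delta$ be the left part of a legal context. $\Delta \vdash M \Rightarrow_{\beta\delta} M^\Delta$
for all $M$ with $\mathrm{FC}(M) \subseteq \mathrm{DOM}(\Delta)$.

PROOF: By induction on the definition of $M^\Delta$. We only treat the case
$\Delta \vdash bM_1\cdots M_n \Rightarrow_{\beta\delta} (bM_1\cdots M_n)^\Delta$ where $bM_1\cdots M_n$ is a δ-redex. As in
the definition of $(bM_1\cdots M_n)^\Delta$, write

\[ \Delta \equiv \Delta_1,\, b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.U),\, \Delta_2. \]

By induction, we may assume that $\Delta_1 \vdash T \Rightarrow_{\beta\delta} T^{\Delta_1}$ and $\Delta \vdash M_i \Rightarrow_{\beta\delta} M_i^\Delta$.
By the Definition Lemma, $T$ is typable in $\Delta_1; x_1{:}A_1,\ldots,x_n{:}A_n$, so by the
Free Variable Lemma, $\mathrm{FC}(T) \subseteq \mathrm{DOM}(\Delta_1)$. By Lemma 5.42, $\Delta \vdash T \Rightarrow_{\beta\delta}
T^{\Delta_1}$. So $\Delta \vdash bM_1\cdots M_n \Rightarrow_{\beta\delta} T^{\Delta_1}[x_1,\ldots,x_n{:=}M_1^\Delta,\ldots,M_n^\Delta]$. ⊠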


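Theorem 5.45 Let $\Delta$ be the left part of a legal context. Assume $\mathrm{FC}(M) \subseteq
\mathrm{DOM}(\Delta)$. If $\Delta \vdash M \Rightarrow_{\beta\delta} N$ then $\Delta \vdash N \Rightarrow_{\beta\delta} M^\Delta$.


PROOF: Induction on the definition of $M^\Delta$.

• $M \equiv x$. Then $N \equiv x$ and $M^\Delta \equiv x$;

• $M \equiv b$. Distinguish:
  - $b \in \mathrm{PRIMCONS}(\Delta;)$. Then $N \equiv b$ and $M^\Delta \equiv b$;
  - $b \in \mathrm{DEFCONS}(\Delta;)$, but $b$ is not a δ-redex. Then $N \equiv b$ and
    $M^\Delta \equiv b$;
  - $b \in \mathrm{DEFCONS}(\Delta;)$, and $\Delta \equiv \Delta_1,\, b{:=}T{:}U,\, \Delta_2$, and $T \not\equiv \S y{:}T_1.T_2$.
    Then either $N \equiv b$ or $N \equiv T'$ where $T \Rightarrow_{\beta\delta} T'$. If $N \equiv b$ then
    $M \equiv N$ and we can use Lemma 5.44. If $N \equiv T'$ then observe that
    by the induction hypothesis, $\Delta_1 \vdash T' \Rightarrow_{\beta\delta} T^{\Delta_1}$, that by Lemma
    5.42 $\Delta \vdash T' \Rightarrow_{\beta\delta} T^{\Delta_1}$, and that $M^\Delta \equiv T^{\Delta_1}$;

• $M \equiv s$. Then $N \equiv s$ and $M^\Delta \equiv s$;

• $M \equiv \lambda x{:}P.Q$. Then $N \equiv \lambda x{:}P'.Q'$ for some $P', Q'$ with $\Delta \vdash P \Rightarrow_{\beta\delta} P'$
and $\Delta \vdash Q \Rightarrow_{\beta\delta} Q'$. By the induction hypothesis on $P$ and $Q$ we find
$\Delta \vdash P' \Rightarrow_{\beta\delta} P^\Delta$ and $\Delta \vdash Q' \Rightarrow_{\beta\delta} Q^\Delta$. Therefore $\Delta \vdash \lambda x{:}P'.Q' \Rightarrow_{\beta\delta}
\lambda x{:}P^\Delta.Q^\Delta$.

The cases $M \equiv \Pi x{:}P.Q$, $M \equiv \P x{:}P.Q$, and $M \equiv PQ$ where $PQ$ is not
a βδ-redex, are proved similarly;

• $M$ is an application term (and is either a β-redex or a δ-redex). Distinguish:

  - $M$ is a β-redex, $M \equiv (\lambda x{:}P.Q)R$. Distinguish:
    * $N \equiv (\lambda x{:}P'.Q')R'$ for $P', Q', R'$ with $\Delta \vdash P \Rightarrow_{\beta\delta} P'$, $\Delta \vdash
      Q \Rightarrow_{\beta\delta} Q'$ and $\Delta \vdash R \Rightarrow_{\beta\delta} R'$. By induction, $\Delta \vdash Q' \Rightarrow_{\beta\delta} Q^\Delta$
      and $\Delta \vdash R' \Rightarrow_{\beta\delta} R^\Delta$. Therefore $\Delta \vdash N \Rightarrow_{\beta\delta} Q^\Delta[x{:=}R^\Delta]$;
    * $N \equiv Q'[x{:=}R']$ for $Q', R'$ with $\Delta \vdash Q \Rightarrow_{\beta\delta} Q'$ and $\Delta \vdash R \Rightarrow_{\beta\delta}
      R'$. By induction, $\Delta \vdash Q' \Rightarrow_{\beta\delta} Q^\Delta$ and $\Delta \vdash R' \Rightarrow_{\beta\delta} R^\Delta$. By
      Lemma 5.41, $\Delta \vdash Q'[x{:=}R'] \Rightarrow_{\beta\delta} Q^\Delta[x{:=}R^\Delta]$;

  - $M$ is a δ-redex, $M \equiv bM_1\cdots M_n$,
    \[ \Delta \equiv \Delta_1,\, b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.U),\, \Delta_2, \]
    where $T \not\equiv \S y{:}T_1.T_2$. Distinguish:
    * $N \equiv bM_1'\cdots M_n'$ for $M_i'$ with $\Delta \vdash M_i \Rightarrow_{\beta\delta} M_i'$. By induction,
      we have $\Delta \vdash M_i' \Rightarrow_{\beta\delta} M_i^\Delta$. By the Definition Lemma, $T$
      is typable in a context $\Delta_1; x_1{:}A_1,\ldots,x_n{:}A_n$, so by the Free
      Variable Lemma, $\mathrm{FC}(T) \subseteq \mathrm{DOM}(\Delta_1)$. By Lemma 5.44, $\Delta_1 \vdash
      T \Rightarrow_{\beta\delta} T^{\Delta_1}$. By Lemma 5.42, $\Delta \vdash T \Rightarrow_{\beta\delta} T^{\Delta_1}$. Hence
      $\Delta \vdash N \Rightarrow_{\beta\delta} T^{\Delta_1}[x_1,\ldots,x_n{:=}M_1^\Delta,\ldots,M_n^\Delta]$;
    * $N \equiv T'[x_1,\ldots,x_n{:=}M_1',\ldots,M_n']$ for a $T'$ with $\Delta \vdash T \Rightarrow_{\beta\delta} T'$
      and for $M_i'$ with $\Delta \vdash M_i \Rightarrow_{\beta\delta} M_i'$. By the Definition Lemma,
      $T$ is typable in $\Delta_1; x_1{:}A_1,\ldots,x_n{:}A_n$, so by the Free Variable
      Lemma, $\mathrm{FC}(T) \subseteq \mathrm{DOM}(\Delta_1)$. By Lemma 5.42, $\Delta_1 \vdash T \Rightarrow_{\beta\delta} T'$.
      By the induction hypothesis on $T$, $\Delta_1 \vdash T' \Rightarrow_{\beta\delta} T^{\Delta_1}$. As
      $\Delta_1 \vdash T \Rightarrow_{\beta\delta} T'$, $\mathrm{FC}(T') \subseteq \mathrm{DOM}(\Delta_1)$, so by Lemma 5.42,
      $\Delta \vdash T' \Rightarrow_{\beta\delta} T^{\Delta_1}$. By the induction hypothesis, also $\Delta \vdash
      M_i' \Rightarrow_{\beta\delta} M_i^\Delta$. Repeatedly applying Lemma 5.41, we find⁸
      \[ \Delta \vdash T'[x_1,\ldots,x_n{:=}M_1',\ldots,M_n'] \Rightarrow_{\beta\delta} T^{\Delta_1}[x_1,\ldots,x_n{:=}M_1^\Delta,\ldots,M_n^\Delta]. \]
⊠

⁸ We must remark that
\[ T'[x_1,\ldots,x_n{:=}M_1',\ldots,M_n'] \equiv T'[x_1{:=}M_1']\cdots[x_n{:=}M_n'] \]
and
\[ T^{\Delta_1}[x_1,\ldots,x_n{:=}M_1^\Delta,\ldots,M_n^\Delta] \equiv T^{\Delta_1}[x_1{:=}M_1^\Delta]\cdots[x_n{:=}M_n^\Delta]. \]
This is correct as we can assume that the $x_i$ do not occur in the $M_i'$ and $M_i^\Delta$.

Corollary 5.46 (Diamond Property for $\Rightarrow_{\beta\delta}$) Let $\Delta$ be the left part of
a context in which $M$ is typable.
Assume $\Delta \vdash M \Rightarrow_{\beta\delta} N_1$ and $\Delta \vdash M \Rightarrow_{\beta\delta} N_2$. Then there is $P$ such that
$\Delta \vdash N_1 \Rightarrow_{\beta\delta} P$ and $\Delta \vdash N_2 \Rightarrow_{\beta\delta} P$.


PROOF: Immediately from the theorem above: Take $P \equiv M^\Delta$. ⊠


Corollary 5.47 (Church-Rosser property for $\twoheadrightarrow_{\beta\delta}$) Let $\Delta$ be the left
part of a context in which $M$ is typable.

If $\Delta \vdash M \twoheadrightarrow_{\beta\delta} N_1$ and $\Delta \vdash M \twoheadrightarrow_{\beta\delta} N_2$ then there is $P$ such that $\Delta \vdash
N_1 \twoheadrightarrow_{\beta\delta} P$ and $\Delta \vdash N_2 \twoheadrightarrow_{\beta\delta} P$.


PROOF: Directly from Lemma 5.40.2, Lemma 5.40.3 and Corollary 5.46.
⊠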


5c3 Subject reduction 

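Lemma 5.48 (Subject Reduction)
If $\Delta;\Gamma \vdash A : B$ and $A \to_\beta A'$ then $\Delta;\Gamma \vdash A' : B$.
PROOF: The proof is as in [5]. ⊠


Subject Reduction also holds for the reduction relation $\to_\delta$:


Lemma 5.49 (Subject Reduction for $\to_\delta$)
If $\Delta;\Gamma \vdash A : B$ and $\Delta \vdash A \to_\delta A'$ then $\Delta;\Gamma \vdash A' : B$.


PROOF: Following the line of [5], we define $\Delta;\Gamma \to_\delta \Delta;\Gamma'$ iff $\Gamma \equiv \Gamma_1, x{:}A, \Gamma_2$,
and $\Gamma' \equiv \Gamma_1, x{:}A', \Gamma_2$, and $\Delta \vdash A \to_\delta A'$. We define $\Delta;\Gamma \to_\delta \Delta';\Gamma$ similarly,
and we simultaneously prove

$\Delta;\Gamma \vdash A : B$ and $\Delta \vdash A \to_\delta A'$ $\Rightarrow$ $\Delta;\Gamma \vdash A' : B$
$\Delta;\Gamma \vdash A : B$ and $\Delta;\Gamma \to_\delta \Delta';\Gamma$ $\Rightarrow$ $\Delta';\Gamma \vdash A : B$
$\Delta;\Gamma \vdash A : B$ and $\Delta;\Gamma \to_\delta \Delta;\Gamma'$ $\Rightarrow$ $\Delta;\Gamma' \vdash A : B$,

using induction on the derivation of $\Delta;\Gamma \vdash A : B$. We only treat the case in
which the last applied rule is the 2nd application rule, and we only prove
the first of the three statements for this case. We write $A[x_i{:=}B_i]_{i=m}^{n}$ as a
shorthand for $A[x_m{:=}B_m][x_{m+1}{:=}B_{m+1}]\cdots[x_n{:=}B_n]$. We can assume that

\[ \Delta \equiv \Delta_1,\, b{:=}(\S_{i=1}^{n} x_i{:}A_i.T){:}(\P_{i=1}^{n} x_i{:}A_i.B),\, \Delta_2 \quad (1) \]

with $B \not\equiv \P y{:}B_1.B_2$, and that the conclusion of the 2nd application rule is

\[ \Delta;\Gamma \vdash bM_1\cdots M_n : K_n \quad (2) \]

for some $K_n$, and therefore

\[ \Delta \vdash bM_1\cdots M_n \to_\delta T[x_i{:=}M_i]_{i=1}^{n}. \]

We must prove: $\Delta;\Gamma \vdash T[x_i{:=}M_i]_{i=1}^{n} : K_n$. We do this in two steps.

1. We analyse the structure of $K_n$, and derive that
   $\Delta \vdash K_n =_{\beta\delta} B[x_i{:=}M_i]_{i=1}^{n}$;
2. We show that $\Delta;\Gamma \vdash T[x_i{:=}M_i]_{i=1}^{n} : B[x_i{:=}M_i]_{i=1}^{n}$.

Ad 1.

We repeatedly apply the Generation Lemma, starting with (2), thus obtaining
$K_n, K_{n-1},\ldots,K_1$, $K_n', K_{n-1}',\ldots,K_1'$, $L_n,\ldots,L_1$ such that

\[ \Delta;\Gamma \vdash bM_1\cdots M_{i-1} : (\P x_i{:}L_i.K_i'); \quad (3) \]
\[ \Delta;\Gamma \vdash M_i : L_i; \quad (4) \]
\[ \Delta \vdash K_i =_{\beta\delta} K_i'[x_i{:=}M_i]; \quad (5) \]
\[ \Delta \vdash K_{i-1} =_{\beta\delta} \P x_i{:}L_i.K_i'. \quad (6) \]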


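We end with $\Delta;\Gamma \vdash b : (\P x_1{:}L_1.K_1')$. By (1) and the Generation Lemma:

\[ \Delta \vdash \P x_1{:}L_1.K_1' =_{\beta\delta} \P_{j=1}^{n} x_j{:}A_j.B. \]

By the Church-Rosser Theorem we have $L_1 =_{\beta\delta} A_1$ and

\[ \Delta \vdash K_1' =_{\beta\delta} \P_{j=2}^{n} x_j{:}A_j.B. \quad (7) \]

Hence

\[ \Delta \vdash \P x_2{:}L_2.K_2' =_{\beta\delta} K_1 =_{\beta\delta} (\P_{j=2}^{n} x_j{:}A_j.B)[x_1{:=}M_1] \equiv \P_{j=2}^{n} x_j{:}A_j[x_1{:=}M_1].B[x_1{:=}M_1], \]

so by the Church-Rosser Theorem $L_2 =_{\beta\delta} A_2[x_1{:=}M_1]$. Proceeding in this
way, we obtain for $i = 1,\ldots,n$:

\[ \Delta \vdash L_i =_{\beta\delta} A_i[x_j{:=}M_j]_{j=1}^{i-1}; \quad (8) \]
\[ \Delta \vdash K_i' =_{\beta\delta} \P_{j=i+1}^{n} x_j{:}A_j[x_k{:=}M_k]_{k=1}^{i-1}.B[x_k{:=}M_k]_{k=1}^{i-1}; \]
\[ \Delta \vdash K_i =_{\beta\delta} \P_{j=i+1}^{n} x_j{:}A_j[x_k{:=}M_k]_{k=1}^{i}.B[x_k{:=}M_k]_{k=1}^{i}. \]

In particular,
\[ \Delta \vdash K_n =_{\beta\delta} B[x_i{:=}M_i]_{i=1}^{n}. \quad (9) \]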
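Ad 2.
Now we calculate the type of $T[x_i{:=}M_i]_{i=1}^{n}$. By the Definition Lemma on
(1) we also have
\[ \Delta_1; x_1{:}A_1,\ldots,x_n{:}A_n \vdash T : B; \quad (10) \]
so by the Start Lemma: $\Delta_1; x_1{:}A_1,\ldots,x_{i-1}{:}A_{i-1} \vdash A_i : s_i$ for sorts $s_i \in \mathcal{S}$.
This yields:

$\Delta;\Gamma \vdash A_1 : s_1$ (Thinning Lemma);
$\Delta;\Gamma,x_1{:}A_1$ is legal (Start Rule);
$\Delta;\Gamma,x_1{:}A_1 \vdash A_2 : s_2$ (Thinning Lemma);
$\Delta;\Gamma,x_1{:}A_1,x_2{:}A_2$ is legal (Start Rule);
$\vdots$
$\Delta;\Gamma,x_1{:}A_1,\ldots,x_n{:}A_n$ is legal (Start Rule).

Therefore, we can apply the Thinning Lemma to (10), and we find:
\[ \Delta;\Gamma,x_1{:}A_1,\ldots,x_n{:}A_n \vdash T : B. \]
As $\Delta;\Gamma \vdash M_1 : L_1$ (4) and $\Delta;\Gamma \vdash A_1 : s_1$, we have $\Delta;\Gamma \vdash M_1 : A_1$ by the
Conversion rule and (8), so by the Substitution Lemma:
\[ \Delta;\Gamma,x_2{:}A_2[x_1{:=}M_1],\ldots,x_n{:}A_n[x_1{:=}M_1] \vdash T[x_1{:=}M_1] : B[x_1{:=}M_1]; \]
\[ \Delta;\Gamma \vdash A_2[x_1{:=}M_1] : s_2. \]
As $\Delta;\Gamma \vdash M_2 : L_2$ (4) and $\Delta \vdash A_2[x_1{:=}M_1] =_{\beta\delta} L_2$ (8) we have by
conversion $\Delta;\Gamma \vdash M_2 : A_2[x_1{:=}M_1]$, and again by the Substitution Lemma:
\[ \Delta;\Gamma,x_3{:}A_3[x_i{:=}M_i]_{i=1}^{2},\ldots,x_n{:}A_n[x_i{:=}M_i]_{i=1}^{2} \vdash T[x_i{:=}M_i]_{i=1}^{2} : B[x_i{:=}M_i]_{i=1}^{2}; \]
\[ \Delta;\Gamma \vdash A_3[x_1{:=}M_1][x_2{:=}M_2] : s_3. \]

Proceeding in this way we eventually find

\[ \Delta;\Gamma \vdash T[x_i{:=}M_i]_{i=1}^{n} : B[x_i{:=}M_i]_{i=1}^{n}. \quad (11) \]

Applying Lemma 5.32 to (2) we have $\Delta;\Gamma \vdash K_n : s$. Now use the Conversion
Rule, (11), and the fact that $\Delta \vdash K_n =_{\beta\delta} B[x_i{:=}M_i]_{i=1}^{n}$. ⊠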


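Corollary 5.50 (Subject Reduction for $\to_{\beta\delta}$) If $\Delta;\Gamma \vdash A : B$ and
$\Delta \vdash A \to_{\beta\delta} A'$ then $\Delta;\Gamma \vdash A' : B$. ⊠


The Subject Reduction Theorem for $\to_{\beta\delta}$ is used to prove:


Lemma 5.51 Assume $s \in \mathcal{S}$ and $M$ legal.
Then $(\Delta \vdash M =_{\beta\delta} s) \Rightarrow M \equiv s$.


PROOF: First assume $s \in \{\Box, \triangle\}$. If $\Delta;\Gamma \vdash M : N$ for some $\Gamma$ and
$N$, and $\Delta \vdash M =_{\beta\delta} s$ then by Church-Rosser $\Delta \vdash M \twoheadrightarrow_{\beta\delta} s$, so by
Subject Reduction $\Delta;\Gamma \vdash s : N$, contradicting the Generation Lemma. If
$\Delta;\Gamma \vdash N : M$ and $\Delta \vdash M =_{\beta\delta} s$ and $M \not\equiv s$ then we have by Lemma 5.32
that $\Delta;\Gamma \vdash M : P$ for some $P$, so again $\Delta;\Gamma \vdash s : P$, in contradiction with
the Generation Lemma.

Now assume $s \equiv *$, $\Delta;\Gamma \vdash M : N$, and $\Delta \vdash M =_{\beta\delta} s$. Again by Church-Rosser,
$\Delta \vdash M \twoheadrightarrow_{\beta\delta} *$, say $\Delta \vdash M \to_{\beta\delta} \cdots \to_{\beta\delta} M' \to_{\beta\delta} *$. By Subject
Reduction, $\Delta;\Gamma \vdash M' : N$ and $\Delta;\Gamma \vdash * : N$. By the Generation Lemma
$\Delta \vdash N =_{\beta\delta} \Box$, so $N \equiv \Box$. Distinguish:

• $M' \equiv (\lambda x{:}A.B)C$ and $* \equiv B[x{:=}C]$. By the Generation Lemma there
is $B'$ such that $\Delta \vdash B'[x{:=}C] =_{\beta\delta} \Box$ (hence $B'[x{:=}C] \equiv \Box$), $\Delta;\Gamma \vdash
(\lambda x{:}A.B) : (\Pi x{:}A.B')$ and $\Delta;\Gamma \vdash C : A$. $C \equiv \Box$ contradicts $\Delta;\Gamma \vdash
C : A$, so $B' \equiv \Box$. By Lemma 5.33 $\Delta;\Gamma \vdash (\Pi x{:}A.\Box) : *$, so by the
Generation Lemma $\Delta;\Gamma,x{:}A \vdash \Box : *$, contradiction;

• $M' \equiv bM_1\cdots M_n$ and $\Delta \vdash bM_1\cdots M_n \to_\delta T[x_i{:=}M_i]_{i=1}^{n} \equiv *$. The
argument is similar to that in the case $M' \equiv (\lambda x{:}A.B)C$.

If $s \equiv *$, $\Delta;\Gamma \vdash N : M$, and $\Delta \vdash M =_{\beta\delta} s$ then by Lemma 5.32 $M \equiv s'$ for
some $s' \in \mathcal{S}$ (and then $M \equiv s$ by the first part of the proof) or $\Delta;\Gamma \vdash M : s'$
for some $s' \in \mathcal{S}$ (which implies $M \equiv s$ by the above argument). ⊠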


5c4 Strong normalisation 


We prove Strong Normalisation for Gó-reduction in A68 by mapping a ty- 
pable term M (in a context A;T) of A68 to a term |M |a that is typable in a 
strongly normalising PTS. The mapping is constructed in such a way that 
if M >s N, [Mla —5 |Nla, and that if AF M >s N, |M|a —g IN]a. 


Definition 5.52 Let A be the left part of a legal context and let M € T. 
We define |M|, by induction on the length of A and the structure of M. 
e Iz, df for z € V; 


bla df b for all BEC \ DEFCONS (A; ); 


def 
bla = A Ti: lAila, . ITla, 


ifA= Ai, b:= (871 zi: AT): (f BAD). As; 


Isla af 5 fors ES; 


def 
Az:P.Qla = Az: [Pla IQ; 


def 
INz:P.Q|ı = He: |Plı-|Qla; 


def 
e |Gz:P.Q|, = Iz: |Plı- |Qla; 


IPQla = |Pla Qla- 
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The following lemmas are useful: 


Lemma 5.53 Let A be the left part of a legal context and M ET. Then 
Fv(IM|a) = Fv(M). 


PROOF: The proof is by induction on the definition of |M|, and is trivial 
for all cases except the case M = band A = A,,b:=($T.T):{JT.U), A2 
(T#$YT.n). 

By the Definition Lemma, T is typable in A1;T'; therefore Fv(T) C Dom (T) 
(Free Variable Lemma). By the induction hypothesis, FV(|T'|,,) © DOM (T) 
and therefore Fv(]bja) = Ø. X 


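Lemma 5.54 If $\Delta_1$ and $\Delta_2$ are left parts of legal contexts and $\Delta_2 \equiv \Delta_1, \Delta'$
then $|M|_{\Delta_1} \equiv |M|_{\Delta_2}$ for all $M \in \mathcal{T}$ with $\mathrm{FC}(M) \subseteq \mathrm{DOM}(\Delta_1)$.


PROOF: An easy induction on the definition of $|M|_{\Delta_1}$. ⊠

Lemma 5.55 Let $\Delta$ be the left part of a legal context. For all $M, N$:
$$|M[x:=N]|_\Delta \equiv |M|_\Delta[x:=|N|_\Delta].$$


PROOF: By induction on the definition of $|M|_\Delta$. In the case $M \equiv b$ and
$b:=T:U \in \Delta$, use the fact that $\mathrm{FV}(|M|_\Delta) = \mathrm{FV}(M) = \emptyset$ (Lemma 5.53) and
therefore $|M|_\Delta[x:=|N|_\Delta] \equiv |M|_\Delta \equiv |M[x:=N]|_\Delta$. ⊠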
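
The translation of Definition 5.52 can be sketched in a few lines of Python. The tuple encoding of terms and contexts, the function names and the example constants nat and id are illustrative assumptions, not part of the formal development; substitution is kept naive and relies on the usual variable convention.

```python
# Hypothetical tuple encoding of terms (not taken from the source):
#   ("var", x), ("const", b), ("sort", s), ("app", P, Q),
#   ("lam", x, P, Q), ("pi", x, P, Q), ("par", x, P, Q)   # "par": parameter binder
# Left part of a context, oldest entry first:
#   ("decl", b, U)             primitive constant b : U
#   ("def", b, params, T, U)   defined constant with params [(x1, A1), ..., (xn, An)]

def translate(term, delta):
    """Compute |term|_delta: unfold defined constants, map "par" to "pi"."""
    tag = term[0]
    if tag in ("var", "sort"):
        return term
    if tag == "const":
        for i, entry in enumerate(delta):
            if entry[0] == "def" and entry[1] == term[1]:
                _, _, params, body, _ = entry
                earlier = delta[:i]              # Delta_1: the entries before b
                result = translate(body, earlier)
                for x, a in reversed(params):    # wrap as lam x1. ... lam xn. |T|
                    result = ("lam", x, translate(a, earlier), result)
                return result
        return term                              # primitive constant: unchanged
    if tag in ("lam", "pi", "par"):
        _, x, p, q = term
        binder = "pi" if tag == "par" else tag
        return (binder, x, translate(p, delta), translate(q, delta))
    if tag == "app":
        return ("app", translate(term[1], delta), translate(term[2], delta))
    raise ValueError(f"unknown term: {term!r}")


def subst(term, x, n):
    """term[x:=n]; assumes bound names differ from x and from the free names of n."""
    tag = term[0]
    if tag == "var":
        return n if term[1] == x else term
    if tag in ("const", "sort"):
        return term
    if tag == "app":
        return ("app", subst(term[1], x, n), subst(term[2], x, n))
    _, y, p, q = term                            # lam, pi or par
    return (tag, y, subst(p, x, n), subst(q, x, n))


# Example: nat : *,  id := (§a:*. §y:a. y) : (¶a:*. ¶y:a. a)
STAR = ("sort", "*")
delta = [
    ("decl", "nat", STAR),
    ("def", "id", [("a", STAR), ("y", ("var", "a"))], ("var", "y"), ("var", "a")),
]
m = ("app", ("app", ("const", "id"), ("const", "nat")), ("var", "x"))
n = ("const", "nat")

unfolded_id = translate(("const", "id"), delta)
# Lemma 5.55 on this instance: |m[x:=n]| equals |m|[x:=|n|]
lemma_5_55_holds = (
    translate(subst(m, "x", n), delta)
    == subst(translate(m, delta), "x", translate(n, delta))
)
```

On this instance the defined constant id is replaced by a closed λ-term, which is why substitution commutes with the translation, as Lemma 5.55 states.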


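The purpose of the definition of $|M|_\Delta$ is explained in the following two
lemmas:


Lemma 5.56 If $M \to_\beta N$ then $|M|_\Delta \to_\beta^{+} |N|_\Delta$.


PROOF: Induction on the structure of $M$. We only treat the case $M \equiv
(\lambda x{:}P.Q)R$ and $N \equiv Q[x:=R]$.

$$\begin{array}{rcl}
|M|_\Delta & \equiv & (\lambda x{:}|P|_\Delta.|Q|_\Delta)|R|_\Delta \\
 & \to_\beta & |Q|_\Delta[x:=|R|_\Delta] \\
 & \equiv & |Q[x:=R]|_\Delta.
\end{array}$$
⊠

Lemma 5.57 If $\Delta \vdash M \to_\delta N$, then $|M|_\Delta \twoheadrightarrow_\beta |N|_\Delta$.

PROOF: Induction on the structure of $M$. We only treat the case in which
$$\begin{array}{rcl}
M & \equiv & bM_1\cdots M_n; \\
\Delta & \equiv & \Delta_1, b:=(\S_{i=1}^{n} x_i{:}A_i.T):(\P_{i=1}^{n} x_i{:}A_i.U), \Delta_2; \\
N & \equiv & T[x_1,\ldots,x_n:=M_1,\ldots,M_n].
\end{array}$$

Notice that
$$\begin{array}{rcl}
|M|_\Delta & \equiv & (\lambda_{i=1}^{n} x_i{:}|A_i|_{\Delta_1}.|T|_{\Delta_1})\,|M_1|_\Delta\cdots|M_n|_\Delta \\
 & \twoheadrightarrow_\beta & |T|_{\Delta_1}[x_i:=|M_i|_\Delta]_{i=1}^{n} \\
 & \equiv & |T|_\Delta[x_i:=|M_i|_\Delta]_{i=1}^{n} \quad \mbox{(by Lemma 5.54)} \\
 & \equiv & |T[x_i:=M_i]_{i=1}^{n}|_\Delta \\
 & \equiv & |T[x_1,\ldots,x_n:=M_1,\ldots,M_n]|_\Delta.
\end{array}$$

At the last equivalence, we must make a remark similar to footnote 8 on
page 197. ⊠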


Let λSN be the PTS over λ-terms with variables from $\mathcal{V} \cup \mathcal{C}$ and sorts from
$\mathcal{S}$, and the following rules (we choose the name λSN because this system
will help us in showing that λ68 is SN):

$$\begin{array}{ll}
(*,*,*); & \\
(*,*,\triangle); & (\Box,*,\triangle); \\
(*,\Box,\triangle); & (\Box,\Box,\triangle); \\
(*,\triangle,\triangle); & (\Box,\triangle,\triangle).
\end{array}$$

This is in fact the pure type system that is based on the Π-formation
rules that were proposed in Section 5b1. λSN is contained in the system
ECC (see [85]). As ECC is β-strongly normalising, λSN is β-strongly
normalising as well.
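
As a small illustration, the rule set of λSN can be written down as data and queried. The sketch below is only a lookup table; the names LAMBDA_SN_RULES and pi_sorts are hypothetical, and no type checking is performed.

```python
STAR, BOX, TRI = "*", "□", "△"

# The Π-formation rules (s1, s2, s3) of λSN as listed above.
LAMBDA_SN_RULES = {
    (STAR, STAR, STAR),
    (STAR, STAR, TRI), (BOX, STAR, TRI),
    (STAR, BOX, TRI), (BOX, BOX, TRI),
    (STAR, TRI, TRI), (BOX, TRI, TRI),
}

def pi_sorts(s1, s2, rules=LAMBDA_SN_RULES):
    """Sorts s3 for which a rule (s1, s2, s3) allows Πx:A.B : s3 when A : s1 and B : s2."""
    return sorted(s3 for (r1, r2, s3) in rules if (r1, r2) == (s1, s2))
```

Note that the pair (*, *) is covered by two rules, so a Π-type over it may live in * or in △; the domain of a Π-type is never of sort △.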

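We present a translation of λ68-contexts to λSN-contexts:


Definition 5.58 Let $\Delta;\Gamma$ be a legal λ68-context.

• We define $|\Delta|$ by induction on the length of $\Delta$:

  - $|\emptyset| \stackrel{\mathrm{def}}{=} \emptyset$;
  - $|\Delta, b{:}U| \stackrel{\mathrm{def}}{=} |\Delta|, b{:}|U|_\Delta$;
  - $|\Delta, b:=T:U| \stackrel{\mathrm{def}}{=} |\Delta|$;

• If $\Gamma \equiv x_1{:}A_1,\ldots,x_n{:}A_n$ then
  $|\Delta;\Gamma| \stackrel{\mathrm{def}}{=} |\Delta|, x_1{:}|A_1|_\Delta, \ldots, x_n{:}|A_n|_\Delta$.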


We see that definitions $b:=T:U$ in $\Delta$ are not translated into $|\Delta|$. This
corresponds to the fact that all these definitions are unfolded (each defined
constant is replaced by its definiens) in $|b|_\Delta$.

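Now we are able to prove the most important lemma of this subsection:


Lemma 5.59 If $\Delta;\Gamma \vdash_{\lambda 68} M : N$ then $|\Delta;\Gamma| \vdash_{\lambda SN} |M|_\Delta : |N|_\Delta$.


PROOF: The proof is by induction on the derivation of $\Delta;\Gamma \vdash_{\lambda 68} M : N$. We
treat a few cases:


(Start: Primitive Constants)

$$\frac{\Delta;\Gamma \vdash_{\lambda 68} B : s_1 \qquad \Delta; \vdash_{\lambda 68} \P\Gamma.B : s_2}{\Delta, b{:}\P\Gamma.B; \vdash_{\lambda 68} b : \P\Gamma.B}\quad (s_1 \equiv *, \Box).$$

By the induction hypothesis, $|\Delta| \vdash_{\lambda SN} |\P\Gamma.B|_\Delta : s_2$, so by the Start
rule:

$$|\Delta|, b{:}|\P\Gamma.B|_\Delta \vdash_{\lambda SN} b : |\P\Gamma.B|_\Delta.$$

Observe that $|\Delta, b{:}\P\Gamma.B| \equiv |\Delta|, b{:}|\P\Gamma.B|_\Delta$, that $|b|_{\Delta, b:\P\Gamma.B} \equiv b$ and
that (by Lemma 5.54) $|\P\Gamma.B|_\Delta \equiv |\P\Gamma.B|_{\Delta, b:\P\Gamma.B}$.


(Start: Defined Constants)

$$\frac{\Delta;\Gamma \vdash_{\lambda 68} T : B : s_1 \qquad \Delta; \vdash_{\lambda 68} \P\Gamma.B : s_2}{\Delta, b:=(\S\Gamma.T):(\P\Gamma.B); \vdash_{\lambda 68} b : \P\Gamma.B}\quad (s_1 \equiv *, \Box).$$

By induction we have

$$|\Delta;| \vdash_{\lambda SN} |\P\Gamma.B|_\Delta : s_2,$$

so (write $\Gamma \equiv x_1{:}A_1,\ldots,x_n{:}A_n$):

$$|\Delta;| \vdash_{\lambda SN} \Pi_{i=1}^{n} x_i{:}|A_i|_\Delta.|B|_\Delta : s_2. \qquad (12)$$

By induction, we also have $|\Delta;\Gamma| \vdash_{\lambda SN} |T|_\Delta : |B|_\Delta$, so:

$$|\Delta|, x_1{:}|A_1|_\Delta, \ldots, x_n{:}|A_n|_\Delta \vdash_{\lambda SN} |T|_\Delta : |B|_\Delta \qquad (13)$$

and by repeatedly applying the λ-rule on (13) and using the fact that,
by the Induction Hypothesis, the types $\Pi_{i=j}^{n} x_i{:}|A_i|_\Delta.|B|_\Delta$ are all
typable, we find:

$$|\Delta| \vdash_{\lambda SN} \left(\lambda_{i=1}^{n} x_i{:}|A_i|_\Delta.|T|_\Delta\right) : \left(\Pi_{i=1}^{n} x_i{:}|A_i|_\Delta.|B|_\Delta\right). \qquad (14)$$


(Application 1) (the Application 2-case is similar)

$$\frac{\Delta;\Gamma \vdash_{\lambda 68} M : (\Pi x{:}A.B) \qquad \Delta;\Gamma \vdash_{\lambda 68} N : A}{\Delta;\Gamma \vdash_{\lambda 68} MN : B[x:=N]}$$

By the induction hypothesis, we have
$|\Delta;\Gamma| \vdash_{\lambda SN} |M|_\Delta : (\Pi x{:}|A|_\Delta.|B|_\Delta)$,
and $|\Delta;\Gamma| \vdash_{\lambda SN} |N|_\Delta : |A|_\Delta$. The application rule gives
$|\Delta;\Gamma| \vdash_{\lambda SN} |M|_\Delta|N|_\Delta : |B|_\Delta[x:=|N|_\Delta]$.
Use the definition of $|MN|_\Delta$ and Lemma 5.55 to obtain
$|\Delta;\Gamma| \vdash_{\lambda SN} |MN|_\Delta : |B[x:=N]|_\Delta$.
⊠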


Corollary 5.60 (Strong Normalisation) λ68 is βδ-strongly normalising.

PROOF: Assume we have an infinite βδ-reduction path in λ68:
$$M_1 \to_{\beta\delta} M_2 \to_{\beta\delta} M_3 \to_{\beta\delta} \cdots \qquad (15)$$

As δ-reduction is strongly normalising (5.17 and 5.38.2), there must be
infinitely many β-reductions in this reduction path, so we have a path

$$N_1 \twoheadrightarrow_\delta N_1' \to_\beta N_2 \twoheadrightarrow_\delta N_2' \to_\beta N_3 \twoheadrightarrow_\delta N_3' \to_\beta \cdots$$
By Lemmas 5.56 and 5.57, this gives us a reduction path

$$|N_1|_\Delta \twoheadrightarrow_\beta |N_1'|_\Delta \to_\beta^{+} |N_2|_\Delta \twoheadrightarrow_\beta |N_2'|_\Delta \to_\beta^{+} |N_3|_\Delta \twoheadrightarrow_\beta \cdots$$

which is an infinite β-reduction path in λSN. By Lemma 5.59, $|N_1|_\Delta$ is a
legal term in λSN. But as λSN is strongly normalising, the above infinite
β-reduction path cannot exist. Hence, the infinite βδ-reduction path (15)
does not exist, either. ⊠


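5c5 The formal relation between AUT-68 and λ68


Theorem 5.61 Let $\mathfrak{B}$ be an AUTOMATH book and $\Gamma$ an AUTOMATH context.

• If $\mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \mathrm{OK}$ then $\mathfrak{B};\Gamma$ is legal;

• If $\mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \Sigma : \Omega$ then $\mathfrak{B};\Gamma \vdash_{\lambda 68} \Sigma : \Omega$.

PROOF: We prove both statements simultaneously, using induction on the
derivation of $\mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \mathrm{OK}$ and $\mathfrak{B};\Gamma \vdash \Sigma : \Omega$ of Definition 5.10 and
Definition 5.11. We only treat one case; the other cases are similar or
trivial. Assume the last step of the derivation has been an application of
the book extension rule def2:

$$\frac{\mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \Sigma_2 : \mathsf{type} \qquad \mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \Sigma_1 : \Sigma_2' \qquad \mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \Sigma_2 =_{\beta\delta} \Sigma_2'}{\mathfrak{B}, (\Gamma; k; \Sigma_1; \Sigma_2); \emptyset \vdash_{\text{AUT-68}} \mathrm{OK}}$$

By the induction hypothesis, we have

$$\mathfrak{B};\Gamma \vdash_{\lambda 68} \Sigma_2 : * \qquad (16)$$
and
$$\mathfrak{B};\Gamma \vdash_{\lambda 68} \Sigma_1 : \Sigma_2'. \qquad (17)$$
By Lemma 5.38, we have
$$\mathfrak{B} \vdash_{\lambda 68} \Sigma_2 =_{\beta\delta} \Sigma_2'. \qquad (18)$$
Applying the conversion rule of λ68 to (16), (17) and (18) yields
$$\mathfrak{B};\Gamma \vdash_{\lambda 68} \Sigma_1 : \Sigma_2. \qquad (19)$$

Notice that $\mathfrak{B};\Gamma$ is legal, so for each $x{:}\alpha \in \Gamma$ (say: $\Gamma \equiv \Gamma_1, x{:}\alpha, \Gamma_2$) we have
$\mathfrak{B};\Gamma_1 \vdash \alpha : s$ for an $s \in \{*,\Box\}$, by the Free Variable Lemma 5.22. Thus we
can repeatedly apply the ¶-formation rule (starting with (16)) to obtain:

$$\mathfrak{B}; \vdash_{\lambda 68} \P\Gamma.\Sigma_2 : \triangle \qquad (20)$$

(if $\Gamma \equiv \emptyset$ then we apply the ¶-formation rule zero times, and the type of
$\P\Gamma.\Sigma_2$ is $*$ instead of $\triangle$). Now we can apply the (Start: dc) rule on (19),
(16) and (20) to obtain:

$$\mathfrak{B}, k:=(\S\Gamma.\Sigma_1):(\P\Gamma.\Sigma_2); \vdash_{\lambda 68} k : \P\Gamma.\Sigma_2,$$
so $\mathfrak{B}, (\Gamma; k; \Sigma_1; \Sigma_2); \equiv \mathfrak{B}, k:=(\S\Gamma.\Sigma_1):(\P\Gamma.\Sigma_2);$ is legal. ⊠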


It is possible to prove a conservativity theorem (in the style: If $\mathfrak{B};\Gamma \vdash_{\lambda 68}
\Sigma : \Omega$, then $\mathfrak{B};\Gamma \vdash_{\text{AUT-68}} \Sigma : \Omega$), but we want to prove that all the typable
terms of λ68 have some interpretation in AUT-68, and not only the terms
that have an equivalent in AUT-68. We have to distinguish six different
cases, and the interpretation of these six cases is given after the proof of
the next theorem.


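Theorem 5.62 Assume $\Delta;\Gamma \vdash_{\lambda 68} M : N$. Then there is an AUTOMATH
book $\mathfrak{B}$ and an AUTOMATH context $\Gamma'$ such that $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \mathrm{OK}$, and
$\mathfrak{B};\Gamma' \equiv \Delta;\Gamma$. Moreover,

1. If $N \equiv \Box$ then $M \equiv *$;

2. If $\Delta;\Gamma \vdash_{\lambda 68} N : \Box$ then $N \equiv *$ and there is $\Omega \in \mathcal{E}$ such that $\Omega \equiv M$
   and $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Omega : \mathsf{type}$;

3. If $N \equiv \triangle$ then there is $\Gamma'' \equiv x_1{:}\Sigma_1,\ldots,x_n{:}\Sigma_n$ and $\Omega \in \mathcal{E}$ such that

   • $\Gamma', \Gamma''$ is correct with respect to $\mathfrak{B}$;
   • $M \equiv \P\Gamma''.\Omega$;
   • $\Omega \equiv \mathsf{type}$ or $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Omega : \mathsf{type}$;

4. If $\Delta;\Gamma \vdash_{\lambda 68} N : \triangle$ then there are $b \in \mathcal{C}$ and $\Sigma_1,\ldots,\Sigma_n \in \mathcal{E}$ such that
   $M \equiv b\Sigma_1\cdots\Sigma_n$. Moreover, $\mathfrak{B}$ contains a line

   $$(x_1{:}\Omega_1,\ldots,x_m{:}\Omega_m;\ b;\ \Xi_1;\ \Xi_2)$$
   such that

   • $m > n$;
   • $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Sigma_i : \Omega_i[x_1,\ldots,x_{i-1}:=\Sigma_1,\ldots,\Sigma_{i-1}]$ $(1 \le i \le n)$;
   • $N \equiv (\P_{i=n+1}^{m} x_i{:}\Omega_i.\Xi_2)[x_1,\ldots,x_n:=\Sigma_1,\ldots,\Sigma_n]$;

5. If $N \equiv *$ then there is $\Omega \in \mathcal{E}$ such that $\Omega \equiv M$ and $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}}
   \Omega : \mathsf{type}$;

6. If $\Delta;\Gamma \vdash_{\lambda 68} N : *$ then there are $\Sigma, \Omega \in \mathcal{E}$ such that $\Sigma \equiv M$ and
   $\Omega \equiv N$, and $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Sigma : \Omega$, and $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Omega : \mathsf{type}$.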


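PROOF: We use induction on the derivation of $\Delta;\Gamma \vdash_{\lambda 68} M : N$. We only
treat a few cases:


Weakening: definitions The last step in the derivation has been

$$\frac{\Delta; \vdash_{\lambda 68} M : N \qquad \Delta;\Gamma \vdash_{\lambda 68} T : B : s_1 \qquad \Delta; \vdash_{\lambda 68} \P\Gamma.B : s_2}{\Delta, b:=(\S\Gamma.T):(\P\Gamma.B); \vdash_{\lambda 68} M : N}$$

   where $s_1 \equiv *$ or $s_1 \equiv \Box$. Use the induction hypothesis and determine
   $\mathfrak{B}$, $\Gamma'$, $\Sigma_1$, $\Sigma_2$, $\Omega_1$ and $\Omega_2$ such that $\mathfrak{B} \equiv \Delta$, $\Gamma' \equiv \Gamma$, $\Sigma_1 \equiv T$, $\Sigma_2 \equiv B$,
   $\Omega_1 \equiv M$ and $\Omega_2 \equiv N$. We know by induction that $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Sigma_2 :
   \mathsf{type}$ (if $s_1 \equiv *$) or $\Sigma_2 \equiv *$ (if $s_1 \equiv \Box$). Also, $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Sigma_1 : \Sigma_2$.
   This makes it possible to extend $\mathfrak{B}$ with a new line, thus obtaining
   a legal book $\mathfrak{B}, (\Gamma'; b; \Sigma_1; \Sigma_2)$. Using Weakening for AUT-68 (Lemma
   5.19) and the induction hypothesis on $\Delta; \vdash_{\lambda 68} M : N$, it is not hard
   to verify the cases 1-6 for $\Delta, b:=(\S\Gamma.T):(\P\Gamma.B); \vdash_{\lambda 68} M : N$;

Application 2 The last step in the derivation has been

$$\frac{\Delta;\Gamma \vdash_{\lambda 68} M_1 : (\P x{:}A.B) \qquad \Delta;\Gamma \vdash_{\lambda 68} M_2 : A}{\Delta;\Gamma \vdash_{\lambda 68} M_1M_2 : B[x:=M_2]}$$

   Determine $\mathfrak{B}$, $\Gamma'$ such that $\mathfrak{B} \equiv \Delta$ and $\Gamma' \equiv \Gamma$. By Correctness
   of Types 5.32 and the Generation Lemma 5.30, we have $\Delta;\Gamma \vdash_{\lambda 68}
   (\P x{:}A.B) : \triangle$, so by the induction hypothesis (case 4), there are
   $b, \Sigma_1, \ldots, \Sigma_n$ such that $M_1 \equiv b\Sigma_1\cdots\Sigma_n$ and there is a line

   $$(x_1{:}\Omega_1,\ldots,x_m{:}\Omega_m;\ b;\ \Xi_1;\ \Xi_2)$$

   in $\mathfrak{B}$ such that $m > n$, $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Sigma_i : \Omega_i[x_1,\ldots,x_{i-1}:=\Sigma_1,\ldots,\Sigma_{i-1}]$
   for $i = 1,\ldots,n$, and

   $$\P x{:}A.B \equiv \left(\P_{i=n+1}^{m} x_i{:}\Omega_i.\Xi_2\right)[x_i:=\Sigma_i]_{i=1}^{n}.$$

   Observe: $A \equiv \Omega_{n+1}[x_i:=\Sigma_i]_{i=1}^{n}$. As $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Omega_{n+1} : \mathsf{type}$ or
   $\Omega_{n+1} \equiv \mathsf{type}$, we have $\Delta;\Gamma \vdash_{\lambda 68} \Omega_{n+1} : s$ for an $s \in \{*,\Box\}$, and
   by the Substitution Lemma and the Transitivity Lemma we have $\Delta;\Gamma \vdash_{\lambda 68}
   \Omega_{n+1}[x_i:=\Sigma_i]_{i=1}^{n} : s$, hence $\Delta;\Gamma \vdash_{\lambda 68} A : s$.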

   With the induction hypothesis we determine $\Sigma \in \mathcal{E}$ such that

   $$\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}} \Sigma : A,$$

   and $M_2 \equiv \Sigma$. We now treat the most important ones of the cases 1-6:


   4. The only thing that does not directly follow from the results above
      is $m > n+1$. Assume, for the sake of the argument, $m = n+1$.
      Then $B[x:=M_2] \equiv \Xi_2[x_i:=\Sigma_i]_{i=1}^{n+1}$. As $\Delta;\Gamma \vdash_{\lambda 68} B[x:=M_2] : \triangle$,
      $\Xi_2[x_i:=\Sigma_i]_{i=1}^{n+1}$ is of the form $\P x{:}P.Q$, which is impossible;

   6. Notice: $B[x:=M_2] \equiv (\P_{i=n+2}^{m} x_i{:}\Omega_i.\Xi_2)[x_i:=\Sigma_i]_{i=1}^{n+1}$. We have
      $\Delta;\Gamma \vdash_{\lambda 68} B[x:=M_2] : *$. Therefore $B[x:=M_2]$ cannot be of the
      form $\P y{:}P.Q$, and therefore $m = n+1$. Therefore, $\mathfrak{B};\Gamma' \vdash_{\text{AUT-68}}
      b(\Sigma_1,\ldots,\Sigma_n,\Sigma_{n+1}) : \Xi_2[x_i:=\Sigma_i]_{i=1}^{n+1}$.
⊠

Remark 5.63 We give some explanation of the different cases mentioned
in the formulation of Theorem 5.62.


• The cases $N \equiv \Box$ and $\Delta;\Gamma \vdash N : \Box$ imply that there are no other
  terms in λ68 than $*$ itself at the same level as $*$. This corresponds to
  the fact that type is the only “top-expression” in AUT-68;


• The cases $N \equiv *$ and $\Delta;\Gamma \vdash N : *$ give a precise correspondence
  between expressions of AUT-68 and terms of λ68: If $M : N$ in λ68
  then there are expressions $\Sigma$, $\Omega$ in AUT-68 such that $\Sigma : \Omega$ in AUT-68
  and $\Sigma \equiv M$ and $\Omega \equiv N$;


• The cases $N \equiv \triangle$ and $\Delta;\Gamma \vdash N : \triangle$ cover terms that do not have an
  equivalent in AUT-68 but are necessary in λ68 to form terms that have
  equivalents in AUT-68. More specifically, this concerns terms of the form
  $\P_{i=1}^{n} x_i{:}A_i.B$ (which are needed to introduce constants) and terms
  of the form $bM_1\cdots M_n$, where $b$ is a constant of type $\P_{i=1}^{m} x_i{:}A_i.B$
  for certain $m > n$ (which are needed to construct λ68-equivalents of
  expressions of the form $b(\Sigma_1,\ldots,\Sigma_m)$).


We conclude that λ68 and AUT-68 coincide as much as possible, and
that the terms in λ68 that do not have an equivalent in AUT-68 can be
traced easily (these are the terms of type $\triangle$ and the terms of a type $N : \triangle$,
and the sorts $\Box$ and $\triangle$, which are needed to give a type to $*$ and to the
¶-types).

Notice that the alternative definition of δ-reduction in λ68, discussed
at the end of Subsection 5a3, would introduce more terms in λ68 without
an equivalent in AUT-68, namely terms of the form $\lambda_{i=1}^{n} x_i{:}A_i.B$.


5d Related work 


The system AUT-68 is one of several AUTOMATH-systems that have been
proposed. Another frequently used system is AUT-QE. In Section 5d1 we
compare AUT-68 to AUT-QE and describe how we can easily adapt λ68 to
a system λQE.

Recently, various type systems with definitions in PTS-style have been
proposed by, amongst others, Bloo, Kamareddine and Nederpelt ([16, 17])
and by Severi and Poll ([114]). The presentation of AUT-68 in the PTS-like
system λ68 makes a good comparison between these systems and the
definition system in AUT-68 possible. This will be done in Sections 5d2 and
5d3.


5d1 AUT-QE


The system AUT-QE has many similarities with AUT-68. There are a few
extensions:


1. We can also form abstraction expressions $[x{:}\Sigma]\mathsf{type}$ (thus extending
   Definition 5.1);


2. Inhabitants of types of the form $[x{:}\Sigma]\mathsf{type}$ are introduced by extending
   the abstraction rules 1 and 2 of Definition 5.11 with the following
   rule for AUT-QE:

   $$\frac{\mathfrak{B};\Gamma \vdash \Sigma_1 : \mathsf{type} \qquad \mathfrak{B};\Gamma, x{:}\Sigma_1 \vdash \Sigma_2 : \mathsf{type}}{\mathfrak{B};\Gamma \vdash [x{:}\Sigma_1]\Sigma_2 : [x{:}\Sigma_1]\mathsf{type}}$$

   Notice that the expression $[x{:}\Sigma_1]\mathsf{type}$ is not typable, just as type is
   not typable. In a translation to a PTS, these expressions should get
   type $\Box$;


3. There is a new reduction relation on expressions, which is specific for
   AUT-QE and therefore will be called $\to_{QE}$ in the sequel. The relation
   is described by the rule

   $$[x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n][y{:}\Omega]\mathsf{type} \to_{QE} [x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]\mathsf{type}$$
   (for $n \ge 0$).


The first two rules are rather straightforward. They correspond to an
extension of λ→ to λP in Pure Type Systems. It is also easy to extend λ68
with similar rules: We just add the Π-formation rule $(*, \Box, \Box)$:

$$\frac{\Delta;\Gamma \vdash A : * \qquad \Delta;\Gamma, x{:}A \vdash B : \Box}{\Delta;\Gamma \vdash (\Pi x{:}A.B) : \Box}$$

In AUT-68 PAT is implemented in De Bruijn-style (see Section 4a4 and
Example 5.9). An implementation of predicate logic in Howard-style is
not possible in AUT-68, but due to the extension with types of the form
$[x{:}\Sigma]\mathsf{type}$, such an implementation becomes possible in AUT-QE. See [39].

The third rule deserves some extra attention, as it is very unusual. It
is needed in AUT-QE because that system does not distinguish between λs
and Πs. In AUT-68 this did not matter, as from the context it could always
be derived whether an expression $[x{:}\Sigma]\Omega$ should be interpreted as $\lambda x{:}\Sigma.\Omega$
or as $\Pi x{:}\Sigma.\Omega$. The latter should have type type, and the former should not
have type type.

In AUT-QE the situation is more complicated. An expression $[x{:}\Sigma]\Omega$ may
have more than one type:


Example 5.64 Let $\mathfrak{B}$ consist of two lines:
(2, Q =, type), 
(a:type, 2, —, Q). 
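Notice that, using rule (abstr.1) of Definition 5.11, we can derive that
$$\mathfrak{B}; \alpha{:}\mathsf{type} \vdash_{QE} [x{:}\alpha]\alpha : \mathsf{type}. \qquad (21)$$
But using the new abstraction rule of AUT-QE we can also derive
$$\mathfrak{B}; \alpha{:}\mathsf{type} \vdash_{QE} [x{:}\alpha]\alpha : [x{:}\alpha]\mathsf{type}. \qquad (22)$$
More generally, we can prove that the two statements below are equivalent
(that is: if either of them is derivable then they are both derivable)
in AUT-QE:
$$\mathfrak{B};\Gamma \vdash_{QE} [x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]\Omega : [x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]\mathsf{type}; \qquad (23)$$
$$\mathfrak{B};\Gamma \vdash_{QE} [x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]\Omega : [x_1{:}\Sigma_1]\cdots[x_m{:}\Sigma_m]\mathsf{type} \qquad (24)$$
(for $m < n$). In (23), the expression $[x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]\Omega$ should be read as
$\lambda_{i=1}^{n} x_i{:}\Sigma_i.\Omega$; in (24) it should be read as $\lambda_{i=1}^{m} x_i{:}\Sigma_i.\Pi_{j=m+1}^{n} x_j{:}\Sigma_j.\Omega$.
But this equivalence holds only for expressions of the form

$$[x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]\Omega$$

and not for general expressions $\Sigma$ (take, for instance, $\Sigma$ a variable). In order
that the equivalence holds for general expressions $\Sigma$, De Bruijn introduced
a rule for type inclusion:
$$\frac{\mathfrak{B};\Gamma \vdash_{QE} \Sigma : [x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]\mathsf{type}}{\mathfrak{B};\Gamma \vdash_{QE} \Sigma : [x_1{:}\Sigma_1]\cdots[x_m{:}\Sigma_m]\mathsf{type}}$$
Lists of abstractions $[x_1{:}\Sigma_1]\cdots[x_n{:}\Sigma_n]$ were also called telescopes by de
Bruijn. In the rule for type inclusion, we see that one part of the telescope
“collapses”.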
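
The collapsing of a telescope can be illustrated with a short Python sketch. A type of the form [x1:Σ1]···[xn:Σn]type is represented only by its telescope, a list of (variable, type) pairs; this representation and the names qe_step, included_types and show are assumptions made for the illustration, and the step down to bare type assumes that the reduction rule is admitted for an empty prefix.

```python
def qe_step(telescope):
    """One type-inclusion step: drop the last abstraction in front of `type`."""
    if not telescope:
        raise ValueError("bare `type` has no abstraction left to drop")
    return telescope[:-1]

def included_types(telescope):
    """All telescopes obtained from `telescope` by repeated type-inclusion steps."""
    chain = [list(telescope)]
    while chain[-1]:
        chain.append(qe_step(chain[-1]))
    return chain

def show(telescope):
    """Print a telescope followed by `type` in AUTOMATH bracket notation."""
    return "".join(f"[{x}:{s}]" for x, s in telescope) + "type"

# The two types of [x:a]a from Example 5.64, statements (22) and (21).
types_of_example = [show(t) for t in included_types([("x", "a")])]
```

For a telescope of length n this yields n+1 types, one for every prefix of the telescope.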


5d2 Comparison with the DPTSs of Severi and Poll 


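In [114], Severi and Poll present an extension of PTSs with definitions, thus
obtaining Pure Type Systems with Definitions (DPTSs). They extend the
usual PTS-rules with the following D-rules:

$$\begin{array}{ll}
\mbox{(D-start)} & \dfrac{\Gamma \vdash a : A}{\Gamma, x{=}a{:}A \vdash x : A} \\[3ex]
\mbox{(D-weak)} & \dfrac{\Gamma \vdash b : B \qquad \Gamma \vdash a : A}{\Gamma, x{=}a{:}A \vdash b : B} \\[3ex]
\mbox{(D-form)} & \dfrac{\Gamma, x{=}a{:}A \vdash B : s}{\Gamma \vdash (x{=}a{:}A \mbox{ in } B) : s} \\[3ex]
\mbox{(D-intro)} & \dfrac{\Gamma, x{=}a{:}A \vdash b : B \qquad \Gamma \vdash (x{=}a{:}A \mbox{ in } B) : s}{\Gamma \vdash (x{=}a{:}A \mbox{ in } b) : (x{=}a{:}A \mbox{ in } B)} \\[3ex]
\mbox{(D-conv)} & \dfrac{\Gamma \vdash b : B \qquad \Gamma \vdash B' : s \qquad \Gamma \vdash B =_D B'}{\Gamma \vdash b : B'}
\end{array}$$

where D-reduction is defined by the following rules:

$$\Gamma_1, x{=}a{:}A, \Gamma_2 \vdash x \to_D a;$$
$$\Gamma \vdash (x{=}a{:}A \mbox{ in } b) \to_D b \quad (x \notin \mathrm{FV}(b));$$
$$\frac{\Gamma, x{=}a{:}A \vdash b \to_D b'}{\Gamma \vdash (x{=}a{:}A \mbox{ in } b) \to_D (x{=}a{:}A \mbox{ in } b')}$$

and the usual compatibility rules. As we see, there is an extra class of terms
in DPTSs, namely those of the form $(x{=}a{:}A \mbox{ in } b)$.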
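
The interplay of the three D-reduction rules can be made concrete with a small Python sketch that computes D-normal forms. The tuple encoding, with ("let", x, a, A, b) standing for (x=a:A in b), and the function names are assumptions for illustration; the sketch assumes that all bound and defined names are distinct and that definitions are not cyclic, and it ignores typing altogether.

```python
# Hypothetical encoding: ("var", x), ("sort", s), ("app", f, a),
# ("lam", x, A, b), ("pi", x, A, B), and ("let", x, a, A, b) for (x=a:A in b).

def free_vars(t):
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "sort":
        return set()
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    if tag in ("lam", "pi"):
        return free_vars(t[2]) | (free_vars(t[3]) - {t[1]})
    if tag == "let":
        return free_vars(t[2]) | free_vars(t[3]) | (free_vars(t[4]) - {t[1]})
    raise ValueError(f"unknown term: {t!r}")


def d_normal_form(t, defs):
    """D-normal form of t; defs maps each defined name of the context to its definiens."""
    tag = t[0]
    if tag == "var":
        # first rule: unfold a definition that is present in the context
        return d_normal_form(defs[t[1]], defs) if t[1] in defs else t
    if tag == "sort":
        return t
    if tag == "app":
        return ("app", d_normal_form(t[1], defs), d_normal_form(t[2], defs))
    if tag in ("lam", "pi"):
        return (tag, t[1], d_normal_form(t[2], defs), d_normal_form(t[3], defs))
    if tag == "let":
        _, x, a, _type, body = t
        # third rule: reduce the body in the context extended with x=a
        reduced = d_normal_form(body, {**defs, x: a})
        # second rule: x no longer occurs, so the local definition is dropped
        assert x not in free_vars(reduced)
        return reduced
    raise ValueError(f"unknown term: {t!r}")


# (id = λx:a.x : Πx:a.a in id) in a context without any definition
a = ("var", "a")
identity = ("lam", "x", a, ("var", "x"))
local_def = ("let", "id", identity, ("pi", "x", a, a), ("var", "id"))
result = d_normal_form(local_def, {})
```

Here the local definition of id is first used to unfold the body and is then discarded, so the result is the bare λ-term.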
When regarding both systems we find that:


• In DPTSs, definitions do not only occur in a context, but may also
  occur in terms. Moreover, definitions may disappear from contexts
  when they are introduced in terms (e.g. the D-form and the D-intro
  rules, and the last of the three D-reduction rules), and definitions may
  disappear from terms when the definiendum does not occur in that
  term (the middle D-reduction rule).

  This gives definitions a more temporary character: We can use them
  as long as needed, and when we do not need them any more, we can
  remove them from the context.

  Definitions can also play a more local role: A definition that is needed
  in only one term can be imported into that term, and it is not
  necessary to carry it around in the (global) context as well.

  This temporary and local behaviour of definitions is not present in
  AUTOMATH;


• Due to the fact that definitions can also play a local role, D-reduction
  can also unfold definitions which are not present in the (global)
  context, but which are given within the term. For example, we have
  $\alpha{:}* \vdash (\mathrm{id}{=}\lambda x{:}\alpha.x \mbox{ in } \mathrm{id}) \twoheadrightarrow_D \lambda x{:}\alpha.x$, though there is no definition of
  id in the context $\alpha{:}*$.

  Again, this is not possible in AUTOMATH;
• The start rule for definitions in DPTSs,

  $$\frac{\Gamma \vdash T : B}{\Gamma, x{=}T{:}B \vdash x : B}$$

  does not require $\Gamma \vdash B : s$ for a sort $s$. In λ68 we have the rule (Start:
  dc):

  $$\frac{\Delta;\Gamma \vdash T : B : s_1 \qquad \Delta; \vdash \P\Gamma.B : s_2}{\Delta, x:=\S\Gamma.T:\P\Gamma.B; \vdash x : \P\Gamma.B}\quad (s_1 \equiv *, \Box)$$

  where we see that both $B$ and $\P\Gamma.B$ need to be of a certain sort (and
  $B$ must be of sort $*$ or $\Box$);


• The start rules for definitions in DPTSs and in λ68 also differ in
  another respect, namely the type of definiens and definiendum. In
  DPTSs they have the same type (in the notation of the previous
  paragraph: $B$), while in λ68 the definiens $T$ has type $B$ and the
  definiendum $x$ has type $\P\Gamma.B$. This topic has already been discussed
  when we introduced the definition mechanism of λ68 in Section 5b3;


• D-reduction differs from δ-reduction, also when only global definitions
  are taken into account. For instance, δ-reduction is substitutive, i.e. if
  $\Delta \vdash A \to_\delta A'$ then $\Delta \vdash A[x:=b] \to_\delta A'[x:=b]$ (proof: Induction on the
  structure of $A$). D-reduction is not substitutive: take $\Gamma \equiv \alpha{:}*, y{=}\alpha{:}*$.
  Then $\Gamma \vdash y \to_D \alpha$, but $\Gamma \nvdash y[\alpha:=M] \to_D \alpha[\alpha:=M]$ for arbitrary $M$.

  In λ68, this example would look as follows. Take $\Delta \equiv y:=(\S\alpha{:}*.\alpha):(\P\alpha{:}*.*)$.
  Then $\Delta \vdash y\alpha \to_\delta \alpha$ and $\Delta \vdash y\alpha[\alpha:=M] \to_\delta \alpha[\alpha:=M]$.

  Substitutivity for $\to_D$ is lost, because unfolding a definition by
  D-reduction may introduce new free variables in the term. In
  AUTOMATH, all free variables in the definiens must be added as
  parameters to the definiendum. In λ68 this is visible in the Start and
  Weakening rules for defined constants: The right part $\Gamma$ of the context
  $\Delta;\Gamma$ that is used to type the definiens $T$ in these rules serves as list
  of parameters in the definiendum. When an AUTOMATH-definition is
  unfolded, the free variables occurring in the definiens are replaced by
  the parameters;


• We see that the definition of $y$ in λ68 in the example above is more
  general than in the corresponding DPTS situation. In the DPTS-example,
  $y$ D-reduces to one, fixed term $\alpha$. In the λ68 version, $yM$ is
  defined for any (typable) term $M$. To do something similar in DPTSs,
  one needs to define $y$ as $\lambda\alpha{:}*.\alpha$. In particular, one needs to type the
  term $\lambda\alpha{:}*.\alpha$, which involves the use of Π-formation rule $(\Box,\Box)$, so
  the use of a higher type system. One could say that AUTOMATH
  and λ68 use an implicit λ-abstraction where DPTSs need an explicit
  λ-abstraction. On this point, AUTOMATH and λ68 are more flexible
  than DPTSs. This is due to the parameter mechanism of AUTOMATH.
  It is possible to extend DPTSs with a parameter mechanism as well.
  This will be the main topic of Chapter 6.
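
The substitutivity example from the comparison above can be replayed mechanically. The following Python sketch uses a hypothetical, binder-free term encoding and only the root unfolding steps that the example needs; the names d_step and delta_step are illustrative, and M is modelled as an arbitrary constant.

```python
def subst(t, x, n):
    """t[x:=n] for binder-free terms ("var", x), ("const", c), ("app", f, a)."""
    if t[0] == "var":
        return n if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, n), subst(t[2], x, n))
    return t

def d_step(t, defs):
    """D-unfolding of a defined variable: y -> a when y=a is in the context."""
    return defs[t[1]] if t[0] == "var" and t[1] in defs else t

def delta_step(t, defs):
    """delta-unfolding of b M for a one-parameter definition b := (§x.T)."""
    if t[0] == "app" and t[1][0] == "const" and t[1][1] in defs:
        param, body = defs[t[1][1]]
        return subst(body, param, t[2])   # the parameter is replaced by the argument
    return t

M = ("const", "M")
A = ("var", "a")

# DPTS: context a:*, y=a:*
dpts_defs = {"y": A}
y = ("var", "y")
dpts_reduct_of_instance = d_step(subst(y, "a", M), dpts_defs)    # reduct of y[a:=M]
dpts_instance_of_reduct = subst(d_step(y, dpts_defs), "a", M)    # a[a:=M]

# λ68: y := (§a:*.a) : (¶a:*.*)
l68_defs = {"y": ("a", A)}
ya = ("app", ("const", "y"), A)
l68_reduct_of_instance = delta_step(subst(ya, "a", M), l68_defs)  # reduct of (ya)[a:=M]
l68_instance_of_reduct = subst(delta_step(ya, l68_defs), "a", M)  # a[a:=M]
```

In the DPTS case the substitution cannot reach the variable a hidden inside the definition of y, so the two results differ; with the parametric definition both routes end in M.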


We summarise the differences between DPTSs and AUTOMATH:


• DPTSs have global and local definitions. AUTOMATH has only global
  definitions;

• In DPTSs, the type $B$ of a definition $x{=}T{:}B$ does not have to be
  typable itself. In AUTOMATH, $B$ has to be typable;

• The D-reduction of DPTSs is not substitutive; δ-reduction of
  AUTOMATH is substitutive;

• AUTOMATH has a parameter mechanism, DPTSs do not have such a
  mechanism.


5d3 Comparison with systems of Bloo, Kamareddine and 
Nederpelt 


In [17], Bloo, Kamareddine and Nederpelt extend the usual PTSs with
both Π-conversion and definitions. [17] starts with PTSs extended with
Π-reduction, but without definitions (see [73]). This system (which we will
call λβΠ for the moment) does not have the Subject Reduction property.
For instance, one can derive

$$\alpha{:}*, x{:}\alpha \vdash (\lambda y{:}\alpha.y)x : (\Pi y{:}\alpha.\alpha)x,$$
but it is not possible to derive
$$\alpha{:}*, x{:}\alpha \vdash x : (\Pi y{:}\alpha.\alpha)x.$$

Adding a definition mechanism results in a system that we will call λβΠδ
and is the main point of interest in [17]. As a sort of “side effect” of adding
this definition mechanism, λβΠδ has Subject Reduction.

It will be clear that it is useful to take Π-conversion into consideration
when comparing AUTOMATH with λβΠ. Though our system λ68 does not
have Π-conversion, it is very easy to extend it to a system λΠ68 by:


• Changing rule (App$_1$) into

  $$\frac{\Delta;\Gamma \vdash M : \Pi x{:}A.B \qquad \Delta;\Gamma \vdash N : A}{\Delta;\Gamma \vdash MN : (\Pi x{:}A.B)N}$$

  (Rule (App$_2$) remains unchanged; see also the discussion in Section
  5b1);


• Adding a new reduction rule $\to_\Pi$ by

  $$(\Pi x{:}A.B)N \to_\Pi B[x:=N].$$
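
To see the new reduction rule at work, the sketch below contracts β- and Π-redexes at the root of a term in Python. The tuple encoding and the name head_step are assumptions for illustration; the code only computes reducts and does not decide derivability in any of the systems discussed.

```python
def subst(t, x, n):
    """t[x:=n]; assumes bound names differ from x and from the free names of n."""
    tag = t[0]
    if tag == "var":
        return n if t[1] == x else t
    if tag == "app":
        return ("app", subst(t[1], x, n), subst(t[2], x, n))
    if tag in ("lam", "pi"):
        return (tag, t[1], subst(t[2], x, n), subst(t[3], x, n))
    return t  # sorts and constants

def head_step(t):
    """Contract a beta- or Pi-redex at the root of t; return None if there is none."""
    if t[0] == "app" and t[1][0] in ("lam", "pi"):
        _, x, _domain, body = t[1]
        return subst(body, x, t[2])
    return None

alpha, x = ("var", "alpha"), ("var", "x")
term = ("app", ("lam", "y", alpha, ("var", "y")), x)   # (λy:α.y)x
its_type = ("app", ("pi", "y", alpha, alpha), x)       # (Πy:α.α)x

reduced_term = head_step(term)        # the β-reduct of the term
reduced_type = head_step(its_type)    # the Π-reduct of its type
```

The term reduces to x and its type reduces to α, which matches the example used above to discuss Subject Reduction: the reduct x has type α, while the unreduced type is the Π-redex (Πy:α.α)x.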


The system λΠ68 is actually much closer to AUT-68 than λ68, as AUT-68
has Π-conversion as well.

In λΠ68 we do not have Subject Reduction, either: It is not hard to
derive
$$; \alpha{:}*, x{:}\alpha \vdash (\lambda y{:}\alpha.y)x : (\Pi y{:}\alpha.\alpha)x$$

in λΠ68. Nevertheless, we cannot derive
$$; \alpha{:}*, x{:}\alpha \vdash x : (\Pi y{:}\alpha.\alpha)x.$$

(In such a derivation, no definitions can occur: Definitions, once they have
been introduced, cannot be removed from the left part of the context any
more; when we are not allowed to use any definition rules, λΠ68 has no
more rules than the system λβΠ of Bloo, Kamareddine and Nederpelt).

The “restoration” of Subject Reduction in λβΠδ is only because of
the special way in which definitions are introduced and removed from the
context. We do not go into details on this; the interested reader can consult
[17].

Another main difference between λΠ68 and λβΠδ has already appeared
in Section 5d2: In λΠ68 there is a different correspondence between the
types of definiendum and definiens than in λβΠδ.


Conclusions 


In this chapter we described the most basic AUTOMATH-system, AUT-68,
in a PTS style. Though such descriptions have been given before in, for
example, [5] and [54], we feel that our description is more accurate than
the two cited above. Moreover, our description pays attention to the
definition system, which is a crucial item in AUTOMATH. The descriptions
mentioned above do not.

λ68, the main topic of this chapter, does not include Π-conversion (while
AUTOMATH does). However, it is very easy to adapt λ68 to include
Π-conversion (this was done in Section 5d3 to compare our system to the
system in [17]).

The adaptation of λ68 to a system λQE, representing the AUTOMATH-system
AUT-QE, is not hard, either: It requires adaptation of the Π-formation
rule to include not only the rule $(*, *, *)$ but also $(*, \Box, \Box)$, and introduction
of the additional reduction rule of type inclusion.

Of course, the properties of λ68 presented in Section 5c have to be
reviewed for these new systems.

When comparing λ68 to other type systems with definitions, we find an
important difference. In λ68, the correspondence between types of
definiendum and definiens differs from the similar correspondence in the systems
in [114] and [17].

The reason why λ68 differs from other theories in this respect has been
discussed in Section 5b3: The definition system in AUTOMATH allows
parameters to occur in the definiens, and there is no parameter mechanism in
PTSs. In Chapter 6, we extend PTSs with a parameter mechanism. This
extension has AUT-68 as a subsystem. Moreover, we show that a parameter
mechanism also has other advantages.


Chapter 6 


Pure Type Systems with 
Parameters 


This chapter is devoted to the description of pure type systems with
parameters. One reason to study this extension of PTSs is to give a better
description of AUTOMATH than in the previous chapter, where we had to
work with the sort $\triangle$ to store terms and types that did not have a
counterpart in AUTOMATH (cf. Subsection 5b1). Such terms and types were
needed for the description of the system because no parameters were used.
But there are many more arguments why type systems with parameters
deserve to be studied:


Definitions The various AUTOMATH systems had mechanisms to incor- 
porate parameters and definitions in the formal language (as we saw 
in the previous chapter). There are also modern systems in which 
definitions are part of the formal machinery of the system (see [114], 
[16]). We will show that the (now widely accepted) system of Severi 
and Poll [114] can be easily extended with a parameter mechanism; 


Programming languages Parameters and parametric definitions are not
   only used in implementations of type systems. They also occur in
   many other parts of computer science. For example, look at the
   following Pascal fragment P with the function double:

      function double(z : integer) : integer;
      begin
        double := z + z
      end;

   In a PTS with definitions like the one in [114], P could be represented
   by the context declaration

      double = (λz:Int.(z+z)) : (Int → Int).

   Of course, this declaration can imitate the behaviour of the function
   perfectly well. But the construction has the following disadvantages:

   • The declaration has as subterm the type Int → Int. This subterm
     does not occur in P itself. More generally, Pascal does not
     have a mechanism to construct types of the form A → B;

   • Moreover, due to the way in which double is defined, double is
     a separate subterm in a PTS. But double itself is not a separate
     expression in Pascal: you can’t write x := double in a program
     body. One may use the expression double in a program,
     provided that one specifies a parameter p that serves as an argument
     of double.

   We conclude that the translation of P by means of the context
   declaration above is not fully to the point. The extension of the system of
   [114] with a parameter mechanism, to be presented in this chapter,
   allows us to translate P by the parametric context declaration

      double(z:Int) = (z+z) : Int.

   This declaration does not have the disadvantages described above:

   • It doesn’t have the subterm Int → Int;

   • As we will show in this chapter, double itself cannot be a subterm
     of a term. We always have to specify an argument p for
     double, thus constructing a subterm double(p);


First-order logic Implementations of first-order logic in a PTS in PAT-style
   usually use a PTS that is related to λP. λP has sorts $*$, $\Box$,
   axiom $* : \Box$, and two Π-formation rules, $(*, *, *)$ and $(*, \Box, \Box)$. In this
   PTS it is possible to construct types (that is: terms of type $*$ or $\Box$)
   that are not in β-normal form. Hence, a derivation in λP can have
   non-trivial applications of the conversion rule

   $$\frac{\Gamma \vdash A : B_1 \qquad \Gamma \vdash B_2 : s \qquad B_1 =_\beta B_2}{\Gamma \vdash A : B_2}$$

   This can be problematic in implementations. In theory, it is always
   decidable whether two terms $B_1$, $B_2$ are β-equal or not (simply: check
   whether their β-normal forms are syntactically equal or not). In
   practice, such a calculation may take quite some time and memory.
   Therefore, it would be better to use a PTS in which applications of
   the conversion rule are only possible when $B_1 \equiv B_2$. This is the
   case if all types in such a PTS are in β-normal form. As all types
   in λ→ (that is: λP without Π-formation rule $(*, \Box, \Box)$) are in
   β-normal form, it would be a good candidate for an implementation of
   first-order predicate logic. Unfortunately, first-order predicate logic
   cannot be described in PAT-style in λ→. The introduction of the
   relation symbols in a first-order language involves the Π-formation rule
   $(*, \Box, \Box)$.

   But in a first-order language, a relation symbol $R$ always has a fixed
   arity $a(R)$. This means that $R$ itself is not a proposition. It can only
   be used to construct a proposition: if $t_1, \ldots, t_{a(R)}$ are terms, then
   $R(t_1, \ldots, t_{a(R)})$ is a proposition. With the use of parameters in PTSs,
   it is possible to introduce the relation symbols without Π-formation
   rule $(*, \Box, \Box)$. This results in a system in which the conversion rule is
   superfluous, and therefore easier to handle in implementations. See
   Section 6f;


Philosophical arguments The parameter mechanism enables us to de- 
scribe the difference between developers and users of certain systems. 
We illustrate this by expressing the different attitudes of logicians and 
mathematicians towards the induction axiom for natural numbers. A 
logician is someone developing this axiom (or studying its proper- 
ties), whilst the mathematician is usually only interested in applying 
(using) the axiom. 


   Assuming a variable $N$ (the type of natural numbers) of type $*$, a
   variable $0$ (representing the natural number zero) of type $N$ and a variable
   $S$ (an implementation of the successor function: $Snm$ is assumed to
   hold if and only if $m$ is the successor of $n$) of type $N \to N \to *$, the
   induction axiom can be described by the following PTS-type (let’s
   call it: Ind):

   $$\Pi p{:}(N \to *).\ p0 \to (\Pi n{:}N.\Pi m{:}N.\ pn \to Snm \to pm) \to \Pi n{:}N.\ pn$$

   in a PTS with sorts $*$, $\Box$, axiom $* : \Box$ and Π-formation rules $(*, *, *)$,
   $(*, \Box, \Box)$, $(\Box, *, *)$. With this type Ind one can introduce a variable
   ind of type Ind that may serve as a proof term for any application of
   the induction axiom. This is the logician’s approach.


   For a mathematician, who only applies the induction axiom and
   doesn’t need to know the proof-theoretical backgrounds, this
   interpretation is too strong. Translating the mathematician’s conduct to a
   PTS-like setting, we may express this as follows: The mathematician
   uses the term ind only in combination with terms $P : N \to *$, $Q : P0$
   and $R : \Pi n{:}N.\Pi m{:}N.\ Pn \to Snm \to Pm$ to form a term $\mathrm{ind}\,PQR$ of
   type $\Pi n{:}N.\ Pn$. In other words: he is only interested in the
   application of the induction axiom, and treats it as an induction scheme in
   which values $P$, $Q$, $R$ have to be substituted to use it.


   The use of the induction axiom by the mathematician is therefore
   much better described by the following, parametric, scheme ($p$, $q$ and
   $r$ are the parameters of the scheme):

   $$\mathrm{ind}(p{:}N \to *,\ q{:}p0,\ r{:}(\Pi n{:}N.\Pi m{:}N.\ pn \to Snm \to pm)) : \Pi n{:}N.\ pn.$$

   If now $P : N \to *$, $Q : P0$ and $R : \Pi n{:}N.\Pi m{:}N.\ Pn \to Snm \to Pm$, then
   one can form the term $\mathrm{ind}(P, Q, R)$ of type $\Pi n{:}N.\ Pn$. The types
   that occur in this scheme can all be constructed using sorts $*$, $\Box$,
   axiom $* : \Box$ and rules $(*, *, *)$, $(*, \Box, \Box)$, hence the rule $(\Box, *, *)$ is not
   needed (in the logician’s approach, this rule was needed to form the
   Π-abstraction $\Pi p{:}(N \to *)\cdots$).


   Consequently, the type system that is used to describe the
   mathematician’s use of the induction axiom can be weaker than the one
   for the logician. Nevertheless, the parameter mechanism gives the
   mathematician limited (but for his purposes sufficient) access to the
   induction scheme. Without a parameter mechanism, this would not
   have been possible.


   We see that the parameter mechanism enables us to describe the
   difference between a user of a system (in this example: the
   mathematician) and a developer of the same system (in this example: the
   logician). In this light it is interesting to note that AUTOMATH, which
   has a parameter mechanism, was developed from the viewpoint of
   mathematicians (see [23]);


A different form of abstraction and application In λ-calculus without
   parameters there is one mechanism for abstraction and application.
   For abstraction, we use λ-abstraction, and application is
   implemented via function application. Abstraction and application form
   the basis for a type system. A parameter mechanism is a different
   abstraction-and-application mechanism. In the philosophical
   argument above, the parametric scheme for induction could only be used
   when parameters were supplied. In other words: abstraction is
   allowed, but has to be followed immediately by application. In the
   perspective of our study of the various ways in which application and
   abstraction are present in type theory, we conclude that this
   mechanism for combined abstraction and application, being different from
   the λ-calculus mechanism, deserves our attention.


We conclude that there is ample motivation to extend PTSs with parame- 
ters. 

There are several ways in which such an extension can be made. For 
instance, when working in the systems of the Barendregt Cube, we may 
want to add only parametric terms $t(p_1,\dots,p_n)$ for which the parameters
$p_1,\dots,p_n$ have types $A_1,\dots,A_n$ that are of sort $*$. But we could also decide
to add parametric terms $t(p_1,\dots,p_n)$ without this restriction to the types
of the $p_1,\dots,p_n$.

There is a method to classify these various parametric extensions that 
corresponds to the classification of type systems that is used in the frame- 
work of Pure Type Systems. 

In the Barendregt Cube, there are two sorts $*$ and $\Box$, and the various
PTSs in the cube are determined by the various ways in which type
abstractions can be made. If all constructions of $\Pi$-types are allowed, we
obtain the Calculus of Constructions, with rules $(*,*,*)$, $(*,\Box,\Box)$, $(\Box,*,*)$
and $(\Box,\Box,\Box)$. If we do not allow all $\Pi$-type constructions, we get one of
the subsystems of the Calculus of Constructions in the Barendregt Cube.

Something similar can be done with the parameter mechanism. One op- 
tion is to provide one, general way of parametric abstraction and parametric 
application. We then allow all kinds of parameters. On the other hand, 
there are several ways in which a parameter mechanism may be restricted. 
We mention two ways: 


• Assume we are working in one of the systems of the Barendregt Cube,
extended with parameters, and we have that $t(p_1,\dots,p_m)$ has type
$A$. By Correctness of Types, $A$ has either type $*$ or type $\Box$. One can
imagine that we only allow $t(p_1,\dots,p_m)$ if it has type $A$ of type $*$ (so
we only allow parametric terms);


• Still working in one of the systems of the Barendregt Cube extended
with parameters, we will show that the parameters $p_1,\dots,p_m$ in a
term $t(p_1,\dots,p_m)$ are typable themselves. Again, a parameter $p_i$ can
have a type $P_i$ of type $*$ (so $p_i$ is at term level), or a type $P_i$ of type
$\Box$ (so $p_i$ is at type level), and there are systems in which one would
only allow parameters $p_i$ that have a type $P_i$ of type $*$ (or of type $\Box$).


These two possibilities for restriction are orthogonal in the sense that they 
can be combined. In many Pascal versions, for instance, parametric terms 
can only have parameters at term level. It is, for instance, not possible 
in Pascal to write a function CartProd that takes two types $A$ and $B$ as
parameters, and returns a type that represents the Cartesian product $A \times B$
of $A$ and $B$.

It is possible to incorporate such restrictions in our system in a similar
way as the restrictions on the formation of $\Pi$-types in PTSs. We then
obtain rules for parameter constructions. These rules have the form $(s_1, s_2)$.
The sort $s_1$ indicates that the parameters $p_1,\dots,p_m$ have to have types
$P_1,\dots,P_m$ of sort $s_1$. The sort $s_2$ indicates that the resulting parametric
term must have a type $P$ of sort $s_2$. The combination of the rules for
parameter constructions with the well-known rules for the construction of
$\Pi$-types in the Barendregt Cube leads to a division of the Barendregt Cube
into eight sub-cubes (we illustrate this in Figure 11 on page 278). As in the
Barendregt Cube, one dimension in the cube still corresponds with one of
the rules $(*,\Box)$, $(\Box,*)$ or $(\Box,\Box)$. Following an edge of the cube in dimension
$(s_1, s_2)$ can now be done in two ways:


• As was already possible, we can follow the edge to the end. This still
corresponds to accepting the $\Pi$-formation rule $(s_1, s_2, s_2)$;


• We can also follow the edge only half-way. This means that we do
not accept the $\Pi$-formation rule $(s_1, s_2, s_2)$, but that we do accept the
parameter construction rule $(s_1, s_2)$.


This viewpoint suggests that allowing the $\Pi$-formation rule $(s_1, s_2, s_2)$ also
allows the parameter construction rule $(s_1, s_2)$. Formally, one can work
with systems in which we do allow the $\Pi$-formation rule, but do not allow
the parameter construction rule. We can prove, however, that if the
$\Pi$-construction rule $(s_1, s_2, s_2)$ is allowed, a parameter construction involving
rule $(s_1, s_2)$ can be imitated by $\lambda$-abstractions (Theorem 6.79).


This chapter is organised as follows. In Section 6a, we give definitions of
PTSs extended with parametric constants and definitions. This definition
includes an extension of the $\delta$-reduction described in [114] (which unfolds
definitions) to parametric definitions. In Section 6b we show that the
$\delta$-reduction and $\beta\delta$-reductions have the Church-Rosser property, and that
$\delta$-reduction (under some reasonable conditions) is strongly normalising. In
Section 6c, we show some elementary properties of the system introduced in
Section 6a, like a Generation Lemma, and the Subject Reduction theorem
for $\beta\delta$-reduction. We also prove that $\beta\delta$-reduction is strongly normalising
if a slightly stronger PTS is $\beta$-strongly normalising.

Section 6d is devoted to the various ways in which parameters can be 
added to a PTS in a more restricted way, with the refined Barendregt Cube 
of Figure 11 as a result. 

In Section 6e, we compare our system with some other type systems, like 
AUTOMATH. We place various AUTOMATH systems in the refined Baren- 
dregt Cube of Figure 11. 

In Section 6f we see that the use of parameters can sometimes result in 
simpler and more realistic implementations of type systems. 


6a Parametric constants and definitions 


In [114], PTSs extended to include definitions are abbreviated as DPTSs. 
In this section we extend PTSs with parametric constants and definitions. 
This extension will also contain the DPTSs (definitions in DPTSs can be 
interpreted as parametric definitions with zero parameters). In Section 6e, 
we show that AUT-68 can be seen as a (on some points somewhat restricted 
version of a) PTS with parameters and definitions. 


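Definition 6.1 The set $\mathcal{T}_P$ of parametric terms is defined together with
the set $\mathcal{L}_V$ of lists of variables and the set $\mathcal{L}_T$ of lists of terms:

\[
\begin{array}{lcl}
\mathcal{T}_P & ::= & \mathcal{V} \mid \mathcal{S} \mid \mathcal{C}(\mathcal{L}_T) \mid \mathcal{T}_P\mathcal{T}_P \mid \lambda\mathcal{V}{:}\mathcal{T}_P.\mathcal{T}_P \mid \\
 & & \Pi\mathcal{V}{:}\mathcal{T}_P.\mathcal{T}_P \mid \mathcal{C}(\mathcal{L}_V){=}\mathcal{T}_P{:}\mathcal{T}_P \;\textsc{in}\; \mathcal{T}_P; \\
\mathcal{L}_V & ::= & \varnothing \mid (\mathcal{L}_V, \mathcal{V}{:}\mathcal{T}_P); \\
\mathcal{L}_T & ::= & \varnothing \mid (\mathcal{L}_T, \mathcal{T}_P).
\end{array}
\]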


where, as usual, $\mathcal{V}$ is a set of variables, $\mathcal{C}$ is a set of constants, and $\mathcal{S}$ is a
set of sorts. Formally, lists of variables are of the form

\[ (\dots((\varnothing, x_1{:}A_1), x_2{:}A_2) \dots x_n{:}A_n). \]

We usually write $(x_1{:}A_1,\dots,x_n{:}A_n)$ or even $x_1{:}A_1,\dots,x_n{:}A_n$. A similar
convention is adopted for lists of terms. In a parametric term of the form
$c(b_1,\dots,b_n)$, the subterms $b_1,\dots,b_n$ are called the parameters of the term.
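
To make the grammar of Definition 6.1 concrete, it can be transcribed into a small abstract syntax. The Python sketch below is an illustration only and is not part of the original development: the class names (`Var`, `Sort`, `Const`, `App`, `Lam`, `Pi`, `LocalDef`), the helper `parameters`, and the constant `plus` (standing in for an infix $+$) are assumptions made for the example. Lists of terms and lists of variable declarations are represented as tuples.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:                      # an element of V
    name: str


@dataclass(frozen=True)
class Sort:                     # an element of S
    name: str


@dataclass(frozen=True)
class Const:                    # c(b1,...,bn): a constant with a list of terms
    name: str
    args: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class App:                      # ordinary application of two terms
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:                      # lambda-abstraction over a typed variable
    var: str
    ty: "Term"
    body: "Term"


@dataclass(frozen=True)
class Pi:                       # Pi-abstraction over a typed variable
    var: str
    ty: "Term"
    body: "Term"


@dataclass(frozen=True)
class LocalDef:                 # c(x1:A1,...,xn:An)=a:A IN b
    name: str
    params: Tuple[Tuple[str, "Term"], ...]
    definiens: "Term"
    ty: "Term"
    scope: "Term"


Term = Union[Var, Sort, Const, App, Lam, Pi, LocalDef]


def parameters(t: Const) -> Tuple[Term, ...]:
    """The parameters of c(b1,...,bn) are the subterms b1,...,bn."""
    return t.args


# A hypothetical local definition: double(x:N)=plus(x,x):N IN double(n)
N = Var("N")
double_def = LocalDef(
    name="double",
    params=(("x", N),),
    definiens=Const("plus", (Var("x"), Var("x"))),
    ty=N,
    scope=Const("double", (Var("n"),)),
)
```

A constant without parameters is simply `Const("id")`, which corresponds to a parametric term with an empty list of terms.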


Terms of the form $\mathcal{C}(\mathcal{L}_V){=}\mathcal{T}_P{:}\mathcal{T}_P \;\textsc{in}\; \mathcal{T}_P$ represent parametric local
definitions. An example of such a term is $\mathrm{double}(x{:}N){=}(x+x){:}N \;\textsc{in}\; A$. The
term indicates that a subterm of $A$ of the form $\mathrm{double}(P)$ is to be
interpreted as $P + P$, and has type $N$. The definition is local, that is: the
scope of the definition is the term $A$. Local definitions stand in contrast to
global definitions. Global definitions are given in a context $\Gamma$, and refer to
any term that is considered within $\Gamma$ (see the forthcoming Definition 6.8).
The definition system in AUTOMATH can be compared to the system of
global definitions in this Chapter. However, there are no local definitions
in AUTOMATH.

Definition 6.2


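• We extend the definition of $\mathrm{FV}(A)$, the set of free variables of a term
$A$, to parametric terms:

\[
\begin{array}{lcl}
\mathrm{FV}(c(a_1,\dots,a_n)) & = & \bigcup_{i=1}^{n} \mathrm{FV}(a_i); \\
\mathrm{FV}(c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C) & = & \bigcup_{i=1}^{n} \bigl(\mathrm{FV}(A_i) \setminus \{x_1,\dots,x_{i-1}\}\bigr) \\
 & & {} \cup \bigl((\mathrm{FV}(A) \cup \mathrm{FV}(B)) \setminus \{x_1,\dots,x_n\}\bigr) \\
 & & {} \cup \mathrm{FV}(C)
\end{array}
\]

where $\vec{x}{:}\vec{A}$ denotes $x_1{:}A_1,\dots,x_n{:}A_n$;


• We similarly define $\mathrm{CONS}(A)$, the set of constants and global
definitions of $A$:

\[
\begin{array}{lcl}
\mathrm{CONS}(s) = \mathrm{CONS}(x) & = & \varnothing; \\
\mathrm{CONS}(c(a_1,\dots,a_n)) & = & \{c\} \cup \bigcup_{i=1}^{n} \mathrm{CONS}(a_i); \\
\mathrm{CONS}(AB) & = & \mathrm{CONS}(A) \cup \mathrm{CONS}(B); \\
\mathrm{CONS}(\lambda x{:}A.B) & = & \mathrm{CONS}(A) \cup \mathrm{CONS}(B); \\
\mathrm{CONS}(\Pi x{:}A.B) & = & \mathrm{CONS}(A) \cup \mathrm{CONS}(B); \\
\mathrm{CONS}(c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C) & = & \bigcup_{i=1}^{n} \mathrm{CONS}(A_i) \\
 & & {} \cup \mathrm{CONS}(A) \cup \mathrm{CONS}(B) \\
 & & {} \cup (\mathrm{CONS}(C) \setminus \{c\}).
\end{array}
\]

$\mathrm{FV}(A) \cup \mathrm{CONS}(A)$ forms the domain $\mathrm{DOM}(A)$ of $A$.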
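
The clauses of Definition 6.2 translate directly into recursive functions. The following Python sketch is illustrative only: the term representation (`Var`, `Sort`, `Const`, `App`, `Binder`, `LocalDef`) and the constant `plus` are assumptions made for this example, and the clauses for $\lambda$ and $\Pi$ follow the usual definition of free variables.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Sort:
    name: str


@dataclass(frozen=True)
class Const:                      # c(a1,...,an)
    name: str
    args: Tuple[object, ...] = ()


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Binder:                     # kind is "lam" or "pi"
    kind: str
    var: str
    ty: object
    body: object


@dataclass(frozen=True)
class LocalDef:                   # c(x1:A1,...,xn:An)=A:B IN C
    name: str
    params: Tuple[Tuple[str, object], ...]
    definiens: object
    ty: object
    scope: object


def fv(t):
    """Free variables of a parametric term."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Sort):
        return set()
    if isinstance(t, Const):
        return set().union(*(fv(a) for a in t.args))
    if isinstance(t, App):
        return fv(t.fun) | fv(t.arg)
    if isinstance(t, Binder):
        return fv(t.ty) | (fv(t.body) - {t.var})
    if isinstance(t, LocalDef):
        result, earlier = set(), set()
        for x, a_i in t.params:
            result |= fv(a_i) - earlier           # FV(A_i) \ {x_1,...,x_{i-1}}
            earlier.add(x)
        result |= (fv(t.definiens) | fv(t.ty)) - earlier
        return result | fv(t.scope)               # parameters do not bind in C
    raise TypeError(f"not a term: {t!r}")


def cons(t):
    """Constants and global definitions of a parametric term."""
    if isinstance(t, (Var, Sort)):
        return set()
    if isinstance(t, Const):
        return {t.name}.union(*(cons(a) for a in t.args))
    if isinstance(t, App):
        return cons(t.fun) | cons(t.arg)
    if isinstance(t, Binder):
        return cons(t.ty) | cons(t.body)
    if isinstance(t, LocalDef):
        result = set().union(*(cons(a_i) for _, a_i in t.params))
        result |= cons(t.definiens) | cons(t.ty)
        return result | (cons(t.scope) - {t.name})  # c is bound in C
    raise TypeError(f"not a term: {t!r}")


def dom(t):
    return fv(t) | cons(t)


# double(x:N)=plus(x,x):N IN double(y), with plus a hypothetical constant
term = LocalDef("double", (("x", Var("N")),),
                Const("plus", (Var("x"), Var("x"))), Var("N"),
                Const("double", (Var("y"),)))
```

For this term, the parameter `x` is bound in the definiens, and the locally defined `double` is not counted among the constants of the whole term.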


Remark 6.3 The definition of
$\mathrm{FV}(c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C)$
and
$\mathrm{CONS}(c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C)$
make clear what the binding structure in a term $c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C$ is.


• A variable declaration $x_i{:}A_i$ in the parameter list $\vec{x}{:}\vec{A}$ binds all the
occurrences of $x_i$ in $A_j$, for $j > i$. That is: the type of a parameter
$x_i$ may depend on earlier declared parameters;


• Moreover, the declaration $x_i{:}A_i$ binds all the occurrences of $x_i$ in
$A$ and $B$. This corresponds to the intuitive idea of a parametric
definition: $x_i$ can serve as a parameter in the definiens $A$ and in the
type $B$ of the definiens;


• However, the variable declaration $x_i{:}A_i$ does not bind any occurrence
of $x_i$ in $C$. The definiendum $c$ will occur in $C$ only with a list of
parameters $a_1,\dots,a_n$ behind it, so in the form $c(a_1,\dots,a_n)$. The
variables $x_1,\dots,x_n$ in the definition of $c$ only serve to indicate what
the type of the $a_i$s must be (below, we will see that $a_i$ must have type
$A_i[x_j:=a_j]_{j=1}^{i-1}$) and what the type of the term $c(a_1,\dots,a_n)$ is (this
appears to be $B[x_i:=a_i]_{i=1}^{n}$);


• Moreover, we see that $c$ is not included in the constants of
$c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C$.
This is because $c$ is a local definition, and acts as a binder for the
occurrences of $c$ in $C$.


Remark 6.4 There are several reasons for including the type $B$ in a local
definition $c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C$:


• We want to remain consistent with other binders, such as $\lambda$ and $\Pi$.
In a term $\lambda x{:}A.B$ or $\Pi x{:}A.B$ we mention the type of the binder $x$,
therefore we also mention the type of the binder $c$ in a local definition
$c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C$;


• Sometimes $A : B$ indicates that the term $A$ is a proof of a theorem
$B$ (using PAT). If we want to use $B$ in the proof of a new theorem
$B'$, we must use the proof term $A$ of $B$ in the proof $A'$ of $B'$. In
that case it is attractive to abbreviate $A$ by introducing a definition
$c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; A'$. It is important to remember that $c$ is (an
abbreviation of) a proof of $B$, and that is a reason to mention $B$, the type
of $A$, in the definition declaration;


• For practical purposes like proof assistants or proof checkers, it may
seem to be problematic to have $B$ in the definition declaration.
However, the program does not always have to ask the user to explicitly
mention the type of the abbreviation. Often it can find this type
itself via a type checking algorithm. Of course, this also depends on
whether type checking is decidable in the underlying type system.

Sometimes, the user may wish to manually enter the type, because
he/she may prefer a certain formulation of the type to a $\beta$-equivalent
formulation that the program automatically offers.


As usual in PTSs, we do not distinguish between terms that are
equal up to renaming of bound variables: we consider these terms to be
syntactically equal. Moreover, we assume the Barendregt variable
convention:


Convention 6.5 Names of bound variables and constants will always be 
chosen such that they differ from the free ones in a term. 


Hence, we do not write $(\lambda x{:}A.x)x$ but $(\lambda y{:}A.y)x$. Similarly, we write
$c(x'{:}A){=}x'{:}A \;\textsc{in}\; c(x)$ instead of $c(x{:}A){=}x{:}A \;\textsc{in}\; c(x)$.


Definition 6.6 We extend the definition of substitution of a term $a$ for a
variable $x$ in a term $b$, $b[x:=a]$, to parametric terms, assuming that $x$ is not
a bound variable of either $b$ or $a$:

\[
\begin{array}{lcl}
c(b_1,\dots,b_n)[x:=a] & = & c(b_1[x:=a],\dots,b_n[x:=a]); \\
(c(\vec{x}{:}\vec{A}){=}A{:}B \;\textsc{in}\; C)[x:=a] & = & c(x_1{:}A_1[x:=a],\dots,x_n{:}A_n[x:=a]){=} \\
 & & A[x:=a]{:}B[x:=a] \;\textsc{in}\; C[x:=a].
\end{array}
\]


We now define contexts for type systems with parameters and defini- 
tions. 


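Definition 6.7 The set of contexts is given by

\[ \mathcal{C}_P ::= \varnothing \mid (\mathcal{C}_P, \mathcal{V}{:}\mathcal{T}_P) \mid (\mathcal{C}_P, \mathcal{C}(\mathcal{L}_V){=}\mathcal{T}_P{:}\mathcal{T}_P) \mid (\mathcal{C}_P, \mathcal{C}(\mathcal{L}_V){:}\mathcal{T}_P). \]

Notice that $\mathcal{L}_V \subseteq \mathcal{C}_P$: all lists of variable declarations are contexts, as
well. We denote contexts by $\Gamma, \Gamma', \dots$.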


Definition 6.8 Let $\Gamma$ be a context. Elements $x{:}A$, $c(x_1{:}B_1,\dots,x_n{:}B_n){:}A$,
$c(x_1{:}B_1,\dots,x_n{:}B_n){=}a{:}A$ of $\Gamma$ are called declarations.

• $x{:}A$ is a variable declaration.

  - The variable $x$ is the subject of the declaration;

  - $A$ is the type or predicate of the declaration;


• A declaration of the form $c(x_1{:}B_1,\dots,x_n{:}B_n){:}A$ is a constant
declaration.

  - The constant $c$ is the subject of the declaration. As $c$ is
    introduced without further definition, $c$ is called a primitive constant
    (cf. the primitive notions in AUTOMATH);

  - $x_1,\dots,x_n$ are the parameters of the declaration;

  - $A$ is the type (predicate) of the declaration;


• A declaration $c(x_1{:}B_1,\dots,x_n{:}B_n){=}a{:}A$ is called a global definition
declaration or shorthand global definition or definition.

  - The constant $c$ is the subject or definiendum of the declaration.
    $c$ is called a (globally) defined constant;

  - $x_1,\dots,x_n$ are the parameters of the declaration;

  - $a$ is the definiens of the declaration;

  - $A$ is the type (predicate) of the declaration.


The reasons for including the type of a global definition or a parametric 
constant in its declaration are the same as for local definitions. See Remark 
6.4. 

In the rest of this chapter, $\Delta$ denotes a context $x_1{:}B_1,\dots,x_n{:}B_n$
consisting of variable declarations only. Such a context is typically used as a
list of parameters in a definition $c(\Delta){=}a{:}A$. We write

\[ \Delta_i \equiv x_1{:}B_1,\dots,x_{i-1}{:}B_{i-1} \]

for $i \le n$.
We extend the definition of substitution to contexts:

Definition 6.9 Let $\Gamma \in \mathcal{C}_P$, $M \in \mathcal{T}_P$. We define $\Gamma[x:=M]$ as follows:

\[
\begin{array}{lcl}
\varnothing[x:=M] & = & \varnothing; \\
(\Gamma, x{:}A)[x:=M] & = & \Gamma[x:=M]; \\
(\Gamma, x'{:}A)[x:=M] & = & (\Gamma[x:=M], x'{:}A[x:=M]) \quad \text{if } x \neq x'; \\
(\Gamma, c(\Delta){:}A)[x:=M] & = & (\Gamma[x:=M], c(\Delta[x:=M]){:}A[x:=M]); \\
(\Gamma, c(\Delta){=}a{:}A)[x:=M] & = & (\Gamma[x:=M], c(\Delta[x:=M]){=}a[x:=M]{:}A[x:=M]).
\end{array}
\]


For a term $A$ we defined $\mathrm{FV}(A)$ and $\mathrm{CONS}(A)$. For a context $\Gamma$ we do
not form one set $\mathrm{CONS}(\Gamma)$, but we split this set into a set $\mathrm{PRIMCONS}(\Gamma)$,
containing the primitive constants of $\Gamma$, and a set $\mathrm{DEFCONS}(\Gamma)$, containing
the defined constants of $\Gamma$.


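Definition 6.10 Let $\Gamma$ be a context. We define the free variables,
constants and definitions of $\Gamma$:

\[
\begin{array}{l|l|l|l}
\Gamma' & \mathrm{FV}(\Gamma') & \mathrm{PRIMCONS}(\Gamma') & \mathrm{DEFCONS}(\Gamma') \\ \hline
\varnothing & \varnothing & \varnothing & \varnothing \\
\Gamma, x{:}A & \mathrm{FV}(\Gamma) \cup \{x\} & \mathrm{PRIMCONS}(\Gamma) & \mathrm{DEFCONS}(\Gamma) \\
\Gamma, c(\Delta){:}A & \mathrm{FV}(\Gamma) & \mathrm{PRIMCONS}(\Gamma) \cup \{c\} & \mathrm{DEFCONS}(\Gamma) \\
\Gamma, c(\Delta){=}a{:}A & \mathrm{FV}(\Gamma) & \mathrm{PRIMCONS}(\Gamma) & \mathrm{DEFCONS}(\Gamma) \cup \{c\}
\end{array}
\]

Finally we define the domain of $\Gamma$, $\mathrm{DOM}(\Gamma)$, by
$\mathrm{FV}(\Gamma) \cup \mathrm{PRIMCONS}(\Gamma) \cup \mathrm{DEFCONS}(\Gamma)$.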
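
The three sets of Definition 6.10 can be read off from a context by looking only at the kind of each declaration. The Python sketch below illustrates this under simplifying assumptions that are not in the original text: a context is a list of declaration records (`VarDecl`, `ConstDecl`, `DefDecl` are hypothetical names), and types and definientia are kept as opaque strings because they play no role in the computation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VarDecl:          # x:A
    var: str
    ty: str


@dataclass(frozen=True)
class ConstDecl:        # c(Delta):A, declares a primitive constant
    name: str
    params: Tuple[VarDecl, ...]
    ty: str


@dataclass(frozen=True)
class DefDecl:          # c(Delta)=a:A, declares a globally defined constant
    name: str
    params: Tuple[VarDecl, ...]
    definiens: str
    ty: str


def fv(ctx):
    """Variables declared in the context (parameters in Delta are not counted)."""
    return {d.var for d in ctx if isinstance(d, VarDecl)}


def primcons(ctx):
    return {d.name for d in ctx if isinstance(d, ConstDecl)}


def defcons(ctx):
    return {d.name for d in ctx if isinstance(d, DefDecl)}


def dom(ctx):
    return fv(ctx) | primcons(ctx) | defcons(ctx)


# The same name can be declared as a definition in one context
# and as a primitive constant in another.
gamma = [VarDecl("alpha", "*"),
         DefDecl("id", (), "lambda x:alpha. x", "alpha -> alpha")]
gamma_prime = [VarDecl("alpha", "*"),
               ConstDecl("id", (), "alpha -> alpha")]
```

Both contexts have the same domain, but `id` lands in `defcons(gamma)` and in `primcons(gamma_prime)`, so the context decides which role the constant plays.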


In ordinary Pure Type Systems we have that, for a legal term $A$ in a legal
context $\Gamma$, $\mathrm{FV}(A) \subseteq \mathrm{FV}(\Gamma)$. The type of a free variable in $A$, therefore, can
always be determined via $\Gamma$. In our pure type systems with definitions and
parameters we will have: $\mathrm{FV}(A) \subseteq \mathrm{FV}(\Gamma)$ and $\mathrm{CONS}(A) \subseteq \mathrm{PRIMCONS}(\Gamma) \cup
\mathrm{DEFCONS}(\Gamma)$. As a result, not only can the type of a free variable
or a constant be determined via $\Gamma$, but $\Gamma$ also determines whether
a constant in $A$ that is not serving as a local definition within $A$ is a defined
constant or a primitive constant. We therefore define:


Definition 6.11 For a context $\Gamma$ and a term $A$ with $\mathrm{DOM}(A) \subseteq \mathrm{DOM}(\Gamma)$
we define

\[
\begin{array}{lcl}
\mathrm{DEFCONS}_\Gamma(A) & = & \mathrm{CONS}(A) \cap \mathrm{DEFCONS}(\Gamma); \\
\mathrm{PRIMCONS}_\Gamma(A) & = & \mathrm{CONS}(A) \cap \mathrm{PRIMCONS}(\Gamma).
\end{array}
\]

We see that a constant $c \in \mathcal{C}$ can play three roles in a term $A$, with
respect to a context $\Gamma$:


• If $c$ occurs in a subterm $(c(\Delta){=}b{:}B \;\textsc{in}\; a)$ of $A$, then $c$ is a locally
defined constant;


• If $c \in \mathrm{DEFCONS}_\Gamma(A)$, then $c$ is a globally defined constant;


• If $c \in \mathrm{PRIMCONS}_\Gamma(A)$ (or $c \notin \mathrm{DOM}(\Gamma)$), then $c$ is a primitive constant.


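Example 6.12 It is possible that $c \in \mathrm{CONS}(A)$ is a globally defined
constant with respect to a context $\Gamma$, but a primitive constant with respect to
a context $\Gamma'$. Take for example $A \equiv \mathrm{id}()$, $\Gamma \equiv \alpha{:}*,\ \mathrm{id}(){=}(\lambda x{:}\alpha.x){:}(\alpha \to \alpha)$,
and $\Gamma' \equiv \alpha{:}*,\ \mathrm{id}(){:}(\alpha \to \alpha)$.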


A natural condition on a context $\Gamma_1, c(\Delta){=}a{:}A, \Gamma_2$ is that all the free
variables and constants of $a$ and $A$ are declared in either $\Gamma_1$ or $\Delta$, and
that all free variables and constants in a declaration $x_i{:}B_i \in \Delta$ are declared
in $\Gamma_1, \Delta_i$ (recall that $\Delta$ is a standard context $x_1{:}B_1,\dots,x_n{:}B_n$ and $\Delta_i \equiv
x_1{:}B_1,\dots,x_{i-1}{:}B_{i-1}$). We call such a context sound:


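Definition 6.13 $\Gamma \in \mathcal{C}_P$ is sound if $\Gamma \equiv \Gamma_1, c(\Delta){=}a{:}A, \Gamma_2$ implies

\[ \mathrm{DOM}(a) \cup \mathrm{DOM}(A) \subseteq \mathrm{DOM}(\Gamma_1) \cup \mathrm{DOM}(\Delta) \]

and

\[ \mathrm{DOM}(B_i) \subseteq \mathrm{DOM}(\Gamma_1, \Delta_i). \]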


The contexts occurring in the type systems proposed in this chapter 
are all sound (see Lemma 6.23). This fact will be useful when proving 
properties of these systems. 

We will consider some extensions of Pure Type Systems (PTSs). The
definition of PTSs can be found in the appendix (Definition A.20), and has
already been discussed in Section 4b1.


• An extension that includes globally and locally defined constants is
described and studied in [114]: “PTSs with definitions” (D-PTSs);


• Orthogonally, we can extend PTSs with parameter-free primitive
constants. Then we obtain C-PTSs. C-PTSs are not very interesting, as
the role of parameter-free primitive constants can usually be imitated
by variables.¹ One could agree that a parameter-free primitive
constant is a special sort of variable, and promise not to make any ($\lambda$ or
$\Pi$) abstraction over such a variable;


• Our first real extension describes PTSs with parametric primitive
constants, but without definitions ($\vec{C}$-PTSs). The $\vec{C}$-PTSs include
the C-PTSs, as a parameter-free primitive constant can be seen as a
parametric primitive constant with zero parameters;


• Another extension includes parametric defined constants, and can be
seen as a generalisation of D-PTSs: $\vec{D}$-PTSs;


• We can combine the extensions with primitive constants and defined
constants, choosing between parametrised or parameter-free variants.
For instance, we can make an extension that includes parameter-free
defined constants, and parametric primitive constants. We call this
extension $\vec{C}D$-PTSs.


Combining the various extensions, we obtain a hierarchy that can be de- 
picted as in Figure 10. 


Example 6.14 We give some examples of the possibilities of parameters
and definitions.


• We illustrate the difference between PTSs, C-PTSs and $\vec{C}$-PTSs.


  - In the PTS $\lambda{\to}$ (with only one axiom $* : \Box$ and one $\Pi$-formation
    rule $(*,*,*)$) we could introduce a type variable $N : *$ and a
    variable $o : N$ when we want to work with natural numbers.
    $N$ represents the type of natural numbers and $o$ represents the
    natural number zero;


  - Though the representation of objects like the type of natural
    numbers and the natural number zero as a variable works fine in
    practice, there is a philosophical problem with such a
    representation. We do not consider the set $\mathbb{N}$ and the number $0 \in \mathbb{N}$ to
    be variables, because these objects “do not vary”. If we have a
    derivation of $N{:}*,\ o{:}N \vdash t : N$ for some term $t$, it is technically
    possible to make a $\lambda$-abstraction over the variable $o$ and obtain
    $N{:}* \vdash \lambda o{:}N.t : N \to N$. In this judgement, $o$ acts as a variable,
    while it was initially introduced as a constant.


¹ There are, however, extensions of PTSs in which constants play an essential role. See
for instance the Modal PTSs in the thesis of Borghuis [18], p. 28-29.


[Figure 10: The hierarchy of parameters and definitions. The diagram orders the
extensions from PTS at the bottom, via C-PTS and D-PTS, then $\vec{C}$-PTS, CD-PTS
and $\vec{D}$-PTS, then $\vec{C}D$-PTS and $C\vec{D}$-PTS, up to $\vec{C}\vec{D}$-PTS at the top.]


    In C-PTSs we can distinguish between constants and variables.
    If $o$ is introduced as a constant, it is not possible to form a
    $\lambda$-abstraction $\lambda o{:}N.t$;


  - In Example 5.9, we introduced for each proposition $\Sigma$ the type
    $\mathrm{proof}(\Sigma)$ of proofs of $\Sigma$. This cannot be done in the PTS $\lambda{\to}$
    extended with (unparametrised) constants: such a constant proof
    should be of type $\mathrm{prop} \to \mathrm{type}$ and this type cannot be
    constructed in $\lambda{\to}$ (notice that $\mathrm{type} \equiv *$, so the construction of
    $\mathrm{prop} \to \mathrm{type}$ would involve the $\Pi$-formation rule $(*,\Box,\Box)$).

    However, the term proof will hardly ever be used on its own.
    It is usually used when applied to a proposition $\Sigma$. In $\vec{C}$-PTSs
    it is possible to introduce a parametric version of proof by the
    following context declaration:

    \[ \mathrm{proof}(p{:}\mathrm{prop}) : \mathrm{type}. \]

    This does not involve the construction of a type $\mathrm{prop} \to \mathrm{type}$.
    Nevertheless it is possible to construct the term $\mathrm{proof}(P)$ for any
    term $P : \mathrm{prop}$. We obtain a form of polymorphism without using
    the polymorphism of $\lambda$-calculus.

    A disadvantage may be that we cannot speak about the term
    proof “as it is”. When using proof in the syntax, it must always
    be applied to a parameter $\Sigma : \mathrm{prop}$.

    However, an advantage is that we can restrict ourselves to a
    much simpler type system. In the situation above we remain
    within the types of the system $\lambda{\to}$. We do not need to use types
    of the system $\lambda P$. This may have advantages in implementations
    of type systems. For instance, the system $\lambda{\to}$ does not involve
    the conversion rule

    \[ \frac{\Gamma \vdash A : B \qquad \Gamma \vdash B' : s \qquad B =_\beta B'}{\Gamma \vdash A : B'} \]

    while $\lambda P$ does involve such a rule. The conversion rule involves
    $\beta$-equality of terms, and though it is decidable whether two
    $\lambda$-terms of $\lambda P$ are $\beta$-equal or not, it may take a lot of time and/or
    memory to establish such a fact. This may cause serious
    problems when implementing certain type systems. Using parameters
    whenever possible may therefore simplify implementations. We
    give an example in Section 6f;


• We illustrate the difference between PTSs, D-PTSs and $\vec{D}$-PTSs.


  - In a simple PTS like $\lambda{\to}$ one can derive the following statement
    for an identity function:

    \[ \alpha{:}* \vdash (\lambda x{:}\alpha.x) : \alpha \to \alpha; \]

  - The same derivation can be made in the corresponding D-PTS,
    but in that D-PTS we have the possibility of abbreviating the
    term $\lambda x{:}\alpha.x$. We can do this in two ways. First of all, we can
    introduce this definition in the context:

    \[ \alpha{:}*,\ \mathrm{id}{=}(\lambda x{:}\alpha.x){:}(\alpha \to \alpha) \vdash \mathrm{id} : \alpha \to \alpha. \]

    But we can also decide to make a local definition:

    \[ \alpha{:}* \vdash (\mathrm{id}{=}(\lambda x{:}\alpha.x){:}(\alpha \to \alpha) \;\textsc{in}\; \mathrm{id}) : (\mathrm{id}{=}(\lambda x{:}\alpha.x){:}(\alpha \to \alpha) \;\textsc{in}\; \alpha \to \alpha). \]

    We see that the definition of id appears both in the term and in
    the type of the term, but not in the context.

    The advantages of definitions are:

    ◦ We can abbreviate long expressions. This makes terms more
      surveyable: id is shorter than $\lambda x{:}\alpha.x$;

    ◦ We can give names to important expressions. This makes
      terms more understandable: id expresses that we have to do
      with the identity function, whilst $\lambda x{:}\alpha.x$ does not express
      this fact;


  - In a $\vec{D}$-PTS we have more options for abbreviating the identity
    function.

    ◦ First of all, we can make the same derivation as in the
      D-PTS. Formally, there is a small difference: we cannot use
      id but must work with $\mathrm{id}()$, a parametric term with zero
      parameters (as in $\vec{D}$-PTSs we can only work with parametric
      definitions). We obtain (in the case of the global definition):

      \[ \alpha{:}*,\ \mathrm{id}(){=}(\lambda x{:}\alpha.x){:}(\alpha \to \alpha) \vdash \mathrm{id}() : \alpha \to \alpha; \]

    ◦ But we could also decide to use one or more parameters in
      the definition of id. For instance, we could parametrise the
      variable $\alpha$. This results in the declaration

      \[ \mathrm{id}(\alpha{:}*){=}(\lambda x{:}\alpha.x){:}(\alpha \to \alpha). \]

      If we want to use this declaration, we must have a term $T$
      of type $*$. Assuming that we have such a term $T$, we can
      derive:

      \[ \mathrm{id}(\alpha{:}*){=}(\lambda x{:}\alpha.x){:}(\alpha \to \alpha) \vdash \mathrm{id}(T) : T \to T. \]

      We see that we obtain a restricted form of polymorphism in
      this way. The type system may not allow the construction of
      $\lambda\alpha{:}*.\lambda x{:}\alpha.x$; nevertheless the parameter mechanism makes
      it possible to express $\mathrm{id}(T)$ for any type $T : *$;

    ◦ We could also decide to parametrise the variable $x$, and leave
      the variable $\alpha$ unparametrised. This yields a context

      \[ \alpha{:}*,\ \mathrm{id}(x{:}\alpha){=}x{:}\alpha. \]

      We see that the $\lambda$-abstraction $\lambda x{:}\alpha.x$ is parametrised now.
      The definition declaration means: for any term $t$ of type $\alpha$,
      the term $\mathrm{id}(t)$ of type $\alpha$ is defined by $t$. If we have such a
      term $t$, then we can derive

      \[ \alpha{:}*,\ \mathrm{id}(x{:}\alpha){=}x{:}\alpha \vdash \mathrm{id}(t) : \alpha. \]

      Observe that $\mathrm{id}(t)$ does not have type $\alpha \to \alpha$ (as was the
      case with id) but type $\alpha$ (which would also be the type of
      $\mathrm{id}\,t$ if we had used the identity $\mathrm{id}{=}\lambda x{:}\alpha.x$ from $\lambda$-calculus);

    ◦ Finally, one could parametrise both $\alpha$ and $x$. This results
      in a declaration

      \[ \mathrm{id}(\alpha{:}*,\ x{:}\alpha){=}x{:}\alpha \]

      in the context. If we have a term $T$ of type $*$ and a term $t$
      of type $T$, we can derive

      \[ \mathrm{id}(\alpha{:}*,\ x{:}\alpha){=}x{:}\alpha \vdash \mathrm{id}(T, t) : T. \]

    The global definitions given in the $\vec{D}$-PTS case could also be
    made local, as was done in the D-PTS case.


We now start a more detailed description of the various extensions of
PTSs with definitions and parameters.

We define two reduction relations, namely the $\delta$- and $\beta$-reduction.
$\beta$-reduction is defined as usual, and we use $\to_\beta$, $\twoheadrightarrow_\beta$, $\twoheadrightarrow^+_\beta$ and $=_\beta$ as usual.
As far as global definitions are concerned, $\delta$-reduction is comparable to
$\delta$-reduction in AUTOMATH. This is reflected in rule ($\delta 1$) in the definition
below. But now, a $\delta$-reduction step can also unfold local definitions.
Therefore, two new reduction steps are introduced. Rule ($\delta 2$) below removes the
declaration of a local definition if there is no position within its scope where
it can be unfolded (“removal of void local definitions”). Rule ($\delta 3$) shows
how one can treat a local definition as a global definition, and thus how the
problem of unfolding local definitions can be reduced to unfolding global
definitions (“localisation of global definitions”).

Remember that $\Delta \equiv x_1{:}B_1,\dots,x_n{:}B_n$.


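Definition 6.15 We define the following three reduction rules:

\[ \Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \vdash c(b_1,\dots,b_n) \to_\delta a[x_i:=b_i]_{i=1}^{n} \tag{$\delta 1$} \]

\[ \Gamma \vdash c(\Delta){=}a{:}A \;\textsc{in}\; b \to_\delta b \quad \text{if } c \notin \mathrm{CONS}(b) \tag{$\delta 2$} \]

\[ \frac{\Gamma, c(\Delta){=}a{:}A \vdash b \to_\delta b'}{\Gamma \vdash c(\Delta){=}a{:}A \;\textsc{in}\; b \to_\delta c(\Delta){=}a{:}A \;\textsc{in}\; b'} \tag{$\delta 3$} \]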
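
Rule ($\delta 1$) unfolds a globally defined constant by substituting the actual parameters for the formal ones in the definiens. The Python sketch below illustrates a single ($\delta 1$) step at the root of a term. It is a simplification made for this example and not the formal definition: terms are restricted to variables and constants with parameters, the global definitions of the context are kept in a dictionary `defs` mapping a name to its parameter names and definiens, and the names `subst`, `delta1`, `plus` and `zero` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:                      # c(b1,...,bn)
    name: str
    args: Tuple[object, ...] = ()


def subst(t, env):
    """Simultaneous substitution t[x_i := b_i] for the pairs in env."""
    if isinstance(t, Var):
        return env.get(t.name, t)
    return Const(t.name, tuple(subst(a, env) for a in t.args))


def delta1(defs, t):
    """One (delta 1) step at the root: c(b1,...,bn) -> a[x_i := b_i].

    Returns None when t is not an application of a globally defined constant.
    """
    if not isinstance(t, Const) or t.name not in defs:
        return None
    params, definiens = defs[t.name]
    if len(params) != len(t.args):
        raise ValueError("wrong number of parameters for " + t.name)
    return subst(definiens, dict(zip(params, t.args)))


# Global definition double(x:N)=plus(x,x):N, with the types left out.
defs = {"double": (("x",), Const("plus", (Var("x"), Var("x"))))}
step = delta1(defs, Const("double", (Const("zero"),)))
```

Here `step` is the term `plus(zero(), zero())`. A primitive constant such as `plus` has no entry in `defs`, so it cannot be unfolded and `delta1` returns `None` for it.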


Furthermore, we have some compatibility rules. These rules are not
complicated; there are simply quite a lot of them.


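Definition 6.16 We define the following compatibility rules:

\[ \frac{\Gamma, \Delta \vdash a \to_\delta a'}{\Gamma \vdash c(\Delta){=}a{:}A \;\textsc{in}\; b \to_\delta c(\Delta){=}a'{:}A \;\textsc{in}\; b} \]

\[ \frac{\Gamma, \Delta \vdash A \to_\delta A'}{\Gamma \vdash c(\Delta){=}a{:}A \;\textsc{in}\; b \to_\delta c(\Delta){=}a{:}A' \;\textsc{in}\; b} \]

\[ \frac{\Gamma, \Delta_i \vdash B_i \to_\delta B_i'}{\Gamma \vdash c(\Delta){=}a{:}A \;\textsc{in}\; b \to_\delta c(x_1{:}B_1,\dots,x_i{:}B_i',\dots,x_n{:}B_n){=}a{:}A \;\textsc{in}\; b} \]

\[ \frac{\Gamma \vdash a \to_\delta a'}{\Gamma \vdash ab \to_\delta a'b} \qquad \frac{\Gamma \vdash b \to_\delta b'}{\Gamma \vdash ab \to_\delta ab'} \]

\[ \frac{\Gamma, x{:}A \vdash a \to_\delta a'}{\Gamma \vdash \lambda x{:}A.a \to_\delta \lambda x{:}A.a'} \qquad \frac{\Gamma \vdash A \to_\delta A'}{\Gamma \vdash \lambda x{:}A.a \to_\delta \lambda x{:}A'.a} \]

\[ \frac{\Gamma, x{:}A \vdash a \to_\delta a'}{\Gamma \vdash \Pi x{:}A.a \to_\delta \Pi x{:}A.a'} \qquad \frac{\Gamma \vdash A \to_\delta A'}{\Gamma \vdash \Pi x{:}A.a \to_\delta \Pi x{:}A'.a} \]

\[ \frac{\Gamma \vdash a_j \to_\delta a_j'}{\Gamma \vdash c(a_1,\dots,a_n) \to_\delta c(a_1,\dots,a_j',\dots,a_n)} \]

Remark 6.17 One might also expect a compatibility rule

\[ \frac{\Gamma \vdash b \to_\delta b'}{\Gamma \vdash c(\Delta){=}a{:}A \;\textsc{in}\; b \to_\delta c(\Delta){=}a{:}A \;\textsc{in}\; b'} \]

However, this rule is a derived rule (see the forthcoming Lemma 6.26).
Now we can give a formal definition of $\delta$-reduction: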


Definition 6.18 $\delta$-reduction is defined as the smallest relation $\to_\delta$ on $\mathcal{C}_P \times
\mathcal{T}_P \times \mathcal{T}_P$ closed under the rules ($\delta 1$), ($\delta 2$) and ($\delta 3$) of Definition 6.15 and
under the compatibility rules of Definition 6.16.


When $\Gamma$ is the empty context, we write $a \to_\delta a'$ instead of $\Gamma \vdash a \to_\delta a'$.
We extend $\to_\delta$ to contexts:


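Definition 6.19 $\delta$-reduction between contexts is the smallest relation $\to_\delta$
on $\mathcal{C}_P \times \mathcal{C}_P$ closed under the following rules:

\[ \frac{\Gamma_1 \vdash A \to_\delta A'}{\Gamma_1, x{:}A, \Gamma_2 \to_\delta \Gamma_1, x{:}A', \Gamma_2} \]

\[ \frac{\Gamma_1, \Delta \to_\delta \Gamma_1, \Delta'}{\Gamma_1, c(\Delta){:}A, \Gamma_2 \to_\delta \Gamma_1, c(\Delta'){:}A, \Gamma_2} \]

\[ \frac{\Gamma_1, \Delta \vdash A \to_\delta A'}{\Gamma_1, c(\Delta){:}A, \Gamma_2 \to_\delta \Gamma_1, c(\Delta){:}A', \Gamma_2} \]

\[ \frac{\Gamma_1, \Delta \vdash a \to_\delta a'}{\Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \to_\delta \Gamma_1, c(\Delta){=}a'{:}A, \Gamma_2} \]

\[ \frac{\Gamma_1, \Delta \to_\delta \Gamma_1, \Delta'}{\Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \to_\delta \Gamma_1, c(\Delta'){=}a{:}A, \Gamma_2} \]

\[ \frac{\Gamma_1, \Delta \vdash A \to_\delta A'}{\Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \to_\delta \Gamma_1, c(\Delta){=}a{:}A', \Gamma_2} \]

We now describe the extensions to PTSs that are needed to obtain
$\vec{C}$-PTSs and $\vec{D}$-PTSs. We don't discuss D-PTSs and $\vec{C}D$-PTSs: D-PTSs are
introduced in [114] and $\vec{C}D$-PTSs can be constructed by extending D-PTSs
with the additional rules for $\vec{C}$-PTSs.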


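Definition 6.20 ($\vec{C}$-PTS: Pure type systems with parametric
constants) The typing relation $\vdash^{\vec{C}}$ is the smallest relation on $\mathcal{C}_P \times \mathcal{T}_P \times \mathcal{T}_P$
closed under the rules in Definition A.20 and the following ones (we still
write $\Delta \equiv x_1{:}B_1,\dots,x_n{:}B_n$):

\[ (\vec{C}\text{-weak}) \quad \frac{\Gamma \vdash^{\vec{C}} b : B \qquad \Gamma, \Delta \vdash^{\vec{C}} A : s}{\Gamma, c(\Delta){:}A \vdash^{\vec{C}} b : B} \]

\[ (\vec{C}\text{-app}) \quad \frac{\begin{array}{l} \Gamma_1, c(\Delta){:}A, \Gamma_2 \vdash^{\vec{C}} b_i : B_i[x_j:=b_j]_{j=1}^{i-1} \quad (i = 1,\dots,n) \\ \Gamma_1, c(\Delta){:}A, \Gamma_2 \vdash^{\vec{C}} A : s \quad (\text{if } n = 0) \end{array}}{\Gamma_1, c(\Delta){:}A, \Gamma_2 \vdash^{\vec{C}} c(b_1,\dots,b_n) : A[x_j:=b_j]_{j=1}^{n}} \]

where $s \in \mathcal{S}$ and the $c$ that is introduced in the $\vec{C}$-weakening rule is assumed
to be $\Gamma$-fresh.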


At first sight one might miss a $\vec{C}$-introduction rule. Such a rule, however,
is not necessary, as $c$ (on its own) is not a term. $c$ can only be (part of)
a term in the form $c(b_1,\dots,b_n)$, and such terms can be typed by the
$\vec{C}$-application rule.

The extra condition $\Gamma_1, c(\Delta){:}A, \Gamma_2 \vdash^{\vec{C}} A : s$ in the $\vec{C}$-application rule
for $n = 0$ is necessary to prevent an empty list of premises. Such an empty
list of premises would make it possible to have almost arbitrary contexts
in the conclusion. The extra condition is only needed to assure that the
context in the conclusion is a legal context.
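
The way the application rule computes types, with each parameter type possibly depending on the earlier parameters, can be illustrated by a small program. The Python sketch below is a strong simplification and not the typing relation itself: arguments are restricted to variables whose types are looked up in a dictionary, types are compared syntactically (there is no conversion), and the side condition for $n = 0$ is ignored. All names in it (`type_of_application`, `var_types`, and the treatment of `id` as a primitive constant) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str
    args: Tuple[object, ...] = ()


def subst(t, env):
    """Simultaneous substitution t[x_j := b_j] for the pairs in env."""
    if isinstance(t, Var):
        return env.get(t.name, t)
    return Const(t.name, tuple(subst(a, env) for a in t.args))


def type_of_application(var_types, params, result_type, args):
    """Type of c(b1,...,bn) for a declaration c(x1:B1,...,xn:Bn):A.

    var_types maps variable names to their types; every argument b_i
    must be a variable found there (a restriction of this sketch).
    """
    if len(params) != len(args):
        raise TypeError("wrong number of parameters")
    env = {}
    for (x, declared), b in zip(params, args):
        expected = subst(declared, env)        # B_i[x_j := b_j] for j < i
        if var_types[b.name] != expected:
            raise TypeError("parameter " + x + " has the wrong type")
        env[x] = b
    return subst(result_type, env)             # A[x_j := b_j] for j <= n


STAR = Const("*")

# proof(p:prop) : type, applied to a variable P : prop
proof_type = type_of_application(
    {"P": Const("prop")}, (("p", Const("prop")),), Const("type"), (Var("P"),))

# A hypothetical dependent declaration id(alpha:*, x:alpha) : alpha,
# applied to T : * and t : T
id_params = (("alpha", STAR), ("x", Var("alpha")))
id_type = type_of_application(
    {"T": STAR, "t": Var("T")}, id_params, Var("alpha"), (Var("T"), Var("t")))
```

In the second call the type expected for the second parameter is obtained by substituting `T` for `alpha`, which is why `t : T` is accepted and the result type is `T`.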

Adapting these rules for $\vdash^{\vec{C}}$ and the rules for definitions of [114] results
in rules for parametric definitions:


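Definition 6.21 ($\vec{D}$-PTS: Pure type systems with parametric
definitions) The typing relation $\vdash^{\vec{D}}$ is the smallest relation on $\mathcal{C}_P \times \mathcal{T}_P \times \mathcal{T}_P$
closed under the rules in Definition A.20 and the following ones:

\[ (\vec{D}\text{-weak}) \quad \frac{\Gamma \vdash^{\vec{D}} b : B \qquad \Gamma, \Delta \vdash^{\vec{D}} a : A}{\Gamma, c(\Delta){=}a{:}A \vdash^{\vec{D}} b : B} \]

\[ (\vec{D}\text{-app}) \quad \frac{\begin{array}{l} \Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \vdash^{\vec{D}} b_i : B_i[x_j:=b_j]_{j=1}^{i-1} \quad (i = 1,\dots,n) \\ \Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \vdash^{\vec{D}} a : A \quad (\text{if } n = 0) \end{array}}{\Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \vdash^{\vec{D}} c(b_1,\dots,b_n) : A[x_j:=b_j]_{j=1}^{n}} \]

\[ (\vec{D}\text{-form}) \quad \frac{\Gamma, c(\Delta){=}a{:}A \vdash^{\vec{D}} B : s}{\Gamma \vdash^{\vec{D}} c(\Delta){=}a{:}A \;\textsc{in}\; B : s} \]

\[ (\vec{D}\text{-intro}) \quad \frac{\Gamma, c(\Delta){=}a{:}A \vdash^{\vec{D}} b : B \qquad \Gamma \vdash^{\vec{D}} c(\Delta){=}a{:}A \;\textsc{in}\; B : s}{\Gamma \vdash^{\vec{D}} c(\Delta){=}a{:}A \;\textsc{in}\; b : c(\Delta){=}a{:}A \;\textsc{in}\; B} \]

\[ (\vec{D}\text{-conv}) \quad \frac{\Gamma \vdash^{\vec{D}} b : B \qquad \Gamma \vdash^{\vec{D}} B' : s \qquad \Gamma \vdash B =_\delta B'}{\Gamma \vdash^{\vec{D}} b : B'} \]

where $s \in \mathcal{S}$, and the $c$ that is introduced in the $\vec{D}$-weakening rule is
assumed to be $\Gamma$-fresh.


$\vdash^{\vec{D}}$ includes the definition system of [114]: the $\vec{D}$-application rule for
$n = 0$ can be seen as the $\delta$-start rule of D-PTSs.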


Definition 6.22 (Pure Type Systems with (parametric) constants
and (parametric) definitions)
Let $\mathfrak{S}$ be a specification (see A.17).


• A pure type system with (parametric) constants $\vec{C}$-PTS is denoted as
$\lambda^{\vec{C}}(\mathfrak{S})$ and consists of a set of terms $\mathcal{T}_P$, a set of contexts $\mathcal{C}_P$, the
$\beta$-reduction rule and the typing relation $\vdash^{\vec{C}}$;


• A pure type system with (parametric) definitions $\vec{D}$-PTS is denoted
as $\lambda^{\vec{D}}(\mathfrak{S})$ and consists of a set of terms $\mathcal{T}_P$, a set of contexts $\mathcal{C}_P$, $\beta$-
and $\delta$-reduction and the typing relation $\vdash^{\vec{D}}$;


• A pure type system with (parametric) constants and (parametric)
definitions $\vec{C}\vec{D}$-PTS is denoted as $\lambda^{\vec{C}\vec{D}}(\mathfrak{S})$ and consists of a set of terms
$\mathcal{T}_P$, a set of contexts $\mathcal{C}_P$, $\beta$- and $\delta$-reduction and the typing relation
$\vdash^{\vec{C}\vec{D}}$, which is the smallest relation on $\mathcal{C}_P \times \mathcal{T}_P \times \mathcal{T}_P$ that is closed
under the rules of Definition A.20 and the rules of $\vdash^{\vec{C}}$ and $\vdash^{\vec{D}}$.

A term $a$ is legal (with respect to a certain type system) if there are $\Gamma$, $b$
such that either $\Gamma \vdash a : b$ or $\Gamma \vdash b : a$ is derivable (in that type system).
Similarly, a context $\Gamma$ is legal if there are $a$, $b$ such that $\Gamma \vdash a : b$.


All contexts occurring in $\vec{C}\vec{D}$-PTSs are sound (see Definition 6.13). As
$\vec{C}\vec{D}$-PTSs are clearly extensions of PTSs, $\vec{C}$-PTSs and $\vec{D}$-PTSs, this implies
that all contexts occurring in PTSs, $\vec{C}$-PTSs and $\vec{D}$-PTSs are sound. We
need this fact in many proofs in the next sections. The proof of the lemma
below is by induction on the derivation of $\Gamma \vdash^{\vec{C}\vec{D}} a : A$.


Lemma 6.23 Assume $\Gamma \vdash^{\vec{C}\vec{D}} b : B$.
1. $\mathrm{DOM}(b), \mathrm{DOM}(B) \subseteq \mathrm{DOM}(\Gamma)$;
2. $\Gamma$ is sound.


PROOF: We prove the statements (1) and (2) simultaneously by induction
on the derivation of $\Gamma \vdash^{\vec{C}\vec{D}} b : B$. We treat the two most important cases:


• ($\vec{D}$-weakening) $\Gamma, c(\Delta){=}a{:}A \vdash b : B$ because $\Gamma \vdash b : B$ and $\Gamma, \Delta \vdash a : A$.
(1) is trivial; (2) follows from the induction hypothesis for (1);


• ($\vec{D}$-formation) $\Gamma \vdash (c(\Delta){=}a{:}A \;\textsc{in}\; B) : s$ because $\Gamma, c(\Delta){=}a{:}A \vdash B : s$.
(1) follows from the induction hypothesis for (2); (2) is trivial.


∎


6b Properties of terms 


In this section, we prove properties of terms without wondering whether 
these terms are legal or not. In Section 6b1 we discuss some basic properties, 
such as a Substitution Lemma, and substitutivity. Section 6b2 is devoted to 
the Church-Rosser property for $\beta\delta$-reduction, and in Section 6b3 we prove
strong normalisation for $\delta$-reduction.

Though we do not restrict ourselves to legal terms in this section, we
often demand that the free variables and constants of a term are contained
in the domain of a sound context.

6b1 Basic properties


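In the following lemma we show that a $\delta$-reduction step remains invariant
if we enlarge the context. The proof is done by induction on the definition
of $\to_\delta$.


Lemma 6.24 ($\to_\delta$-weakening) Let $(\Gamma_1, \Gamma_2, \Gamma_3) \in \mathcal{C}_P$ be such that
    $\Gamma_1, \Gamma_3 \vdash b \to_\delta b'$.
Then
    $\Gamma_1, \Gamma_2, \Gamma_3 \vdash b \to_\delta b'$.

⊠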


The implications from left to right of the following lemma are a partic- 
ular case of Lemma 6.24. 

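The implications from right to left make it possible to shorten the context.
The first two parts state that declarations of the form $c(\Delta):A$ and $x:A$ in
a context do not have any influence on the reduction relation $\to_{\beta\delta}$. The
last part states that declarations of the form $c(\Delta)=a:A$ in a context do not
have any influence on the $\to_{\beta\delta}$-reduction behaviour of terms $b \in \mathcal{T}_P$ with
$c \notin \mathrm{CONS}(b)$. This makes it possible to remove definition declarations, as
rule ($\delta 2$) of the definition of $\delta$-reduction does for local definitions.

The lemma is proved by induction on the definition of $\to_\beta$ and $\to_\delta$.


Lemma 6.25


1. Let $(\Gamma_1, x:A, \Gamma_2) \in \mathcal{C}_P$ and $b \in \mathcal{T}_P$.
$\Gamma_1, \Gamma_2 \vdash b \to_{\beta\delta} b'$ if and only if $\Gamma_1, x:A, \Gamma_2 \vdash b \to_{\beta\delta} b'$;


2. Let $(\Gamma_1, c(\Delta):A, \Gamma_2) \in \mathcal{C}_P$ and $b \in \mathcal{T}_P$.
$\Gamma_1, \Gamma_2 \vdash b \to_{\beta\delta} b'$ if and only if $\Gamma_1, c(\Delta):A, \Gamma_2 \vdash b \to_{\beta\delta} b'$;


3. Let $(\Gamma_1, c(\Delta)=a:A, \Gamma_2) \in \mathcal{C}_P$ and $b \in \mathcal{T}_P$ be such that $c \notin \mathrm{CONS}(b)$.
$\Gamma_1, \Gamma_2 \vdash b \to_{\beta\delta} b'$ if and only if $\Gamma_1, c(\Delta)=a:A, \Gamma_2 \vdash b \to_{\beta\delta} b'$.
⊠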


Now we show that the compatibility rule for $c(\Delta)=a:A \ \mathrm{IN}\ b$ when we
reduce inside $b$ is a derived rule (and therefore not included in the list of
compatibility rules in Definition 6.16).

Lemma 6.26 The following rule is derivable from the ones in the definition
of $\to_\delta$:

    $\dfrac{\Gamma \vdash b \to_\delta b'}{\Gamma \vdash c(\Delta)=a:A \ \mathrm{IN}\ b \to_\delta c(\Delta)=a:A \ \mathrm{IN}\ b'}$


PROOF: Suppose $\Gamma \vdash b \to_\delta b'$. By Lemma 6.24, $\Gamma, c(\Delta)=a:A \vdash b \to_\delta b'$. By
definition of $\to_\delta$, it follows that $\Gamma \vdash c(\Delta)=a:A \ \mathrm{IN}\ b \to_\delta c(\Delta)=a:A \ \mathrm{IN}\ b'$. ⊠


The following lemma is proved by induction on the structure of a. 


Lemma 6.27 (Substitution Lemma) Suppose $x \not\equiv y$ and $x \notin \mathrm{FV}(d)$.
Then
    $a[x:=b][y:=d] \equiv a[y:=d][x:=b[y:=d]]$.


⊠
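
The Substitution Lemma can be checked on a small example. The sketch below is only an illustration under simplifying assumptions: terms are binder-free (variables, applications and parametric constants), so capture cannot occur, and the names `subst`, `free_vars` and `both_sides` are hypothetical helpers, not notation from this chapter.

```python
# Binder-free terms: a variable is a str, ("app", f, a) is an application,
# and ("con", c, (b1, ..., bn)) stands for the parametric term c(b1,...,bn).

def subst(term, x, b):
    """Return term[x:=b]; there are no binders, so no capture can occur."""
    if isinstance(term, str):
        return b if term == x else term
    if term[0] == "app":
        return ("app", subst(term[1], x, b), subst(term[2], x, b))
    if term[0] == "con":
        return ("con", term[1], tuple(subst(t, x, b) for t in term[2]))
    raise ValueError(f"unknown term: {term!r}")

def free_vars(term):
    """Return the set of variables occurring in a binder-free term."""
    if isinstance(term, str):
        return {term}
    if term[0] == "app":
        return free_vars(term[1]) | free_vars(term[2])
    if term[0] == "con":
        return set().union(*(free_vars(t) for t in term[2]))
    raise ValueError(f"unknown term: {term!r}")

def both_sides(a, x, b, y, d):
    """Return the pair (a[x:=b][y:=d], a[y:=d][x:=b[y:=d]])."""
    lhs = subst(subst(a, x, b), y, d)
    rhs = subst(subst(a, y, d), x, subst(b, y, d))
    return lhs, rhs

a = ("app", "x", ("con", "c", ("y", "x")))
b = ("app", "y", "z")

# Side conditions hold: x and y differ, and x is not free in d = w.
lhs, rhs = both_sides(a, "x", b, "y", "w")
assert "x" not in free_vars("w") and lhs == rhs

# Side condition violated: d = x, so x is free in d and the two sides differ.
lhs_bad, rhs_bad = both_sides(a, "x", b, "y", "x")
assert lhs_bad != rhs_bad
```

The second check shows why the side condition on $\mathrm{FV}(d)$ cannot be dropped.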


The following lemma shows that $\to_\beta$ is substitutive. It is proved by
induction on the generation of $\to_\beta$ and by the Substitution Lemma.


Lemma 6.28 (Substitutivity for $\to_\beta$) If $a \to_\beta a'$ then $a[x:=b] \to_\beta a'[x:=b]$. ⊠


The relation $\to_\delta$ is not substitutive. For example, let

    $\Gamma \equiv x:\alpha, x':\alpha, c()=x:\alpha$.

We have
    $\Gamma \vdash^{D} c() : \alpha$
and
    $\Gamma \vdash c() \to_\delta x$,
but not
    $\Gamma \vdash c()[x:=x'] \to_\delta x[x:=x']$.

The reason for this is to be found in the $\delta$-weakening rule. When we
introduce a new parametric definition $c(\Delta)=a:A$, the term $a$ may contain
free variables that are not in the domain of $\Delta$ but in the domain of $\Gamma$. When
unfolding the definition $c$, these new variables can appear, thus destroying
substitutivity.

However, we do have the following version of substitutivity. It is adapted
so that the substitution now occurs in the context as well. The proof is by
induction on the derivation of $\Gamma \vdash a \to_\delta a'$.

Lemma 6.29 (Weak substitutivity for $\to_\delta$) If $\Gamma \vdash a \to_\delta a'$ then
    $\Gamma[x:=b] \vdash a[x:=b] \to_\delta a'[x:=b]$.


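PROOF: Induction on the derivation of $\Gamma \vdash a \to_\delta a'$. We only consider the
two most interesting cases:


• $\Gamma_1, c(\Delta)=d:A, \Gamma_2 \vdash c(b_1, \ldots, b_n) \to_\delta d[x_i:=b_i]_{i=1}^{n}$ ($\delta 1$).
Now

    $(\Gamma_1, c(\Delta)=d:A, \Gamma_2)[x:=b] \equiv (\Gamma_1[x:=b], c(\Delta[x:=b])=d[x:=b]:A[x:=b], \Gamma_2[x:=b])$

so

    $(\Gamma_1, c(\Delta)=d:A, \Gamma_2)[x:=b] \vdash c(b_1[x:=b], \ldots, b_n[x:=b]) \to_\delta d[x:=b][x_i:=b_i[x:=b]]_{i=1}^{n}$

and as the $x_i$ are bound in $(\Gamma_1, c(\Delta)=d:A, \Gamma_2)$, $x_i \notin \mathrm{FV}(b)$ by the
variable convention, so by the Substitution Lemma

    $d[x:=b][x_i:=b_i[x:=b]]_{i=1}^{n} \equiv d[x_i:=b_i]_{i=1}^{n}[x:=b]$;


• $c \notin \mathrm{CONS}(a)$ and $\Gamma \vdash c(\Delta)=d:A \ \mathrm{IN}\ a \to_\delta a$ ($\delta 2$). We have that $c$ is
bound in $c(\Delta)=d:A \ \mathrm{IN}\ a$, so by the variable convention $c \notin \mathrm{CONS}(b)$,
so $c \notin \mathrm{CONS}(a[x:=b])$. Hence

    $\Gamma[x:=b] \vdash (c(\Delta)=d:A \ \mathrm{IN}\ a)[x:=b] \to_\delta a[x:=b]$.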
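
The counterexample to substitutivity and its repair by Lemma 6.29 can be replayed in a few lines. This is a minimal sketch under stated assumptions: types are dropped, a context is a dictionary from a constant name to `(parameters, body)`, and `subst_all`, `delta_root` and `subst_ctx` are hypothetical helper names.

```python
def subst_all(term, sigma):
    """Simultaneously replace the variables listed in the dict sigma."""
    if isinstance(term, str):
        return sigma.get(term, term)
    if term[0] == "con":
        return ("con", term[1], tuple(subst_all(t, sigma) for t in term[2]))
    raise ValueError(f"unknown term: {term!r}")

def delta_root(ctx, term):
    """One (delta 1) step at the root: c(b1,...,bn) becomes a[x_i:=b_i]."""
    _, name, args = term
    params, body = ctx[name]
    return subst_all(body, dict(zip(params, args)))

def subst_ctx(ctx, x, b):
    """Apply [x:=b] to every definiens in the context (types are omitted)."""
    return {c: (params, subst_all(body, {x: b}))
            for c, (params, body) in ctx.items()}

gamma = {"c": ((), "x")}          # the definition c() = x
term = ("con", "c", ())           # the term c()

# c() unfolds to x in gamma.
assert delta_root(gamma, term) == "x"

# c()[x:=x'] is still c(), and in the unchanged context it unfolds to x,
# not to x[x:=x'] = x'. So delta-reduction is not substitutive.
substituted = subst_all(term, {"x": "x'"})
assert substituted == term
assert delta_root(gamma, substituted) == "x"

# Substituting in the context as well (Lemma 6.29) restores the property.
assert delta_root(subst_ctx(gamma, "x", "x'"), substituted) == "x'"
```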


In the following lemma we reduce inside the term $b$ of $a[x:=b]$. The
proof is by induction on the structure of $a$.


Lemma 6.30 If $\Gamma \vdash b \to_{\beta\delta} b'$ then $\Gamma \vdash a[x:=b] \twoheadrightarrow_{\beta\delta} a[x:=b']$. ⊠

6b2 Church-Rosser for $\to_{\beta\delta}$


In this section we prove the Church-Rosser theorem for $\to_\beta$, $\to_\delta$ and $\to_{\beta\delta}$.
As for ordinary $\lambda$-terms, we have:


Theorem 6.31 (Church-Rosser theorem for $\beta$-reduction) If $a \twoheadrightarrow_\beta a_1$
and $a \twoheadrightarrow_\beta a_2$ then there exists a term $a_3$ such that $a_1 \twoheadrightarrow_\beta a_3$ and
$a_2 \twoheadrightarrow_\beta a_3$. ⊠


The proof is similar to the proof for $\lambda$-terms without definitions and
parameters.


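For a context $\Gamma$ and a term $b$ we define $|b|_\Gamma$, which is, intuitively, $b$ in
which all definitions are unfolded. That is: both the local definitions inside
$b$, and the global definitions given in $\Gamma$. The definition is by induction on
the total number of symbols occurring in $\Gamma$ and $b$.


Definition 6.32 For $a \in \mathcal{T}_P$ and $\Gamma \in \mathcal{C}_P$ we define a term $|a|_\Gamma \in \mathcal{T}_P$ as
follows:

    $|x|_\Gamma \equiv x$ (for $x \in V$);
    $|s|_\Gamma \equiv s$ (for $s \in S$);
    $|c(b_1,\ldots,b_n)|_\Gamma \equiv |a|_{\Gamma_1,\Delta}[x_i:=|b_i|_\Gamma]_{i=1}^{n}$ if $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$;
    $|c(b_1,\ldots,b_n)|_\Gamma \equiv c(|b_1|_\Gamma,\ldots,|b_n|_\Gamma)$ if $c \notin \mathrm{DEFCONS}(\Gamma)$;
    $|ab|_\Gamma \equiv |a|_\Gamma |b|_\Gamma$;
    $|\lambda x:A.a|_\Gamma \equiv \lambda x:|A|_\Gamma.|a|_{\Gamma,x:A}$;
    $|\Pi x:A.B|_\Gamma \equiv \Pi x:|A|_\Gamma.|B|_{\Gamma,x:A}$;
    $|c(\Delta)=a:A \ \mathrm{IN}\ b|_\Gamma \equiv |b|_{\Gamma,c(\Delta)=a:A}$ (where $c$ is $\Gamma$-fresh).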


The following lemma shows that $|b|_\Gamma$ is independent of variable declarations
$x:A$ and constant declarations $c(\Delta):A$ in $\Gamma$. The proof is by induction
on the definition of $|b|_{\Gamma_1,\Gamma_2}$.


Lemma 6.33

1. $|b|_{\Gamma_1,\Gamma_2} \equiv |b|_{\Gamma_1,x:A,\Gamma_2}$;
2. $|b|_{\Gamma_1,\Gamma_2} \equiv |b|_{\Gamma_1,c(\Delta):A,\Gamma_2}$.

⊠


By induction on the definition of $|b|_\Gamma$ one shows that $|b|_\Gamma$ does not
contain any local definitions.


Lemma 6.34 For all $b \in \mathcal{T}_P$ and $\Gamma \in \mathcal{C}_P$, $|b|_\Gamma$ has no subterms of the form
$(c(\Delta)=a:A \ \mathrm{IN}\ d)$. ⊠


The intuition on $|b|_\Gamma$ suggests that all definitions of $b$ are unfolded in
$|b|_\Gamma$. However, there may be global definitions in $\Gamma$ that have not been
unfolded in $|b|_\Gamma$. Take, for example, $\Gamma \equiv (c()=c'():*, c'()=c''():*)$. Then
$|c()|_\Gamma \equiv |c'()|_{\langle\rangle} \equiv c'()$, but $c'()$ is not in $\delta$-normal form with respect to $\Gamma$.
This is due to the fact that $\Gamma$ is not a sound context (see Definition 6.13).

By induction on the definition of $|b|_\Gamma$, we show that if $\Gamma$ is sound, $|b|_\Gamma$
is equal to $b$ with all the definitions in $b$ and $\Gamma$ unfolded. It is no serious
restriction to consider only sound contexts, as all contexts that appear in
CD-PTSs are sound (Lemma 6.23).
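
The unfolding map and the role of soundness can be sketched as follows. The code is an illustration on a type-free fragment only (variables, applications, parametric constants with global definitions); the representation of a context as a list of `(name, parameters, body)` triples and the names `unfold` and `constants` are assumptions made for this sketch.

```python
def subst_all(term, sigma):
    """Simultaneously replace the variables listed in the dict sigma."""
    if isinstance(term, str):
        return sigma.get(term, term)
    if term[0] == "app":
        return ("app", subst_all(term[1], sigma), subst_all(term[2], sigma))
    return ("con", term[1], tuple(subst_all(t, sigma) for t in term[2]))

def unfold(ctx, term):
    """Compute |term|_ctx; ctx is a list of definitions (name, params, body)."""
    if isinstance(term, str):
        return term
    if term[0] == "app":
        return ("app", unfold(ctx, term[1]), unfold(ctx, term[2]))
    _, name, args = term
    new_args = tuple(unfold(ctx, b) for b in args)
    for i, (c, params, body) in enumerate(ctx):
        if c == name:
            # The definiens is unfolded in the prefix Gamma_1 only.
            return subst_all(unfold(ctx[:i], body), dict(zip(params, new_args)))
    return ("con", name, new_args)

def constants(term):
    """Return the names of all constants occurring in a term."""
    if isinstance(term, str):
        return set()
    if term[0] == "app":
        return constants(term[1]) | constants(term[2])
    return {term[1]}.union(*(constants(t) for t in term[2]))

# A sound context: each definiens only mentions earlier definitions.
sound = [("k2", ("x",), ("app", "x", "x")),
         ("k1", ("y",), ("con", "k2", (("con", "k2", ("y",)),)))]
normal = unfold(sound, ("con", "k1", ("z",)))
assert constants(normal) == set()          # every definition is unfolded

# The unsound context from the text: c() = c'(), c'() = c''().
unsound = [("c", (), ("con", "c'", ())), ("c'", (), ("con", "c''", ()))]
stuck = unfold(unsound, ("con", "c", ()))
assert stuck == ("con", "c'", ())          # c'() is still defined in the context
```

In the unsound case the result still contains the defined constant $c'$, which is exactly the situation the paragraph above describes.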


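Lemma 6.35 Let $\Gamma$ be a sound context such that $\mathrm{DOM}(b) \subseteq \mathrm{DOM}(\Gamma)$.
Then $\mathrm{DOM}(|b|_\Gamma) \subseteq \mathrm{DOM}(\Gamma) \setminus \mathrm{DEFCONS}(\Gamma)$.


PROOF: Induction on the definition of $|b|_\Gamma$. We treat the two most
interesting cases (at (IH) we use the induction hypothesis):


• $b \equiv c(b_1,\ldots,b_n)$ and $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$.

    $\mathrm{DOM}(|b|_\Gamma) = \mathrm{DOM}(|a|_{\Gamma_1,\Delta}[x_i:=|b_i|_\Gamma]_{i=1}^{n})$
    $\subseteq (\mathrm{DOM}(|a|_{\Gamma_1,\Delta}) \setminus \{x_1,\ldots,x_n\}) \cup \bigcup_{i=1}^{n} \mathrm{DOM}(|b_i|_\Gamma)$
    $\subseteq$ (IH) $((\mathrm{DOM}(\Gamma_1,\Delta) \setminus \mathrm{DEFCONS}(\Gamma_1)) \setminus \{x_1,\ldots,x_n\}) \cup (\mathrm{DOM}(\Gamma) \setminus \mathrm{DEFCONS}(\Gamma))$
    $= \mathrm{DOM}(\Gamma) \setminus \mathrm{DEFCONS}(\Gamma)$.

We can use the induction hypothesis at (IH) because $\Gamma$ is sound, and
therefore $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma_1, \Delta)$;

• $b \equiv c(\Delta)=a:A \ \mathrm{IN}\ b'$.

    $\mathrm{DOM}(|b|_\Gamma) = \mathrm{DOM}(|b'|_{\Gamma,c(\Delta)=a:A})$
    $\subseteq$ (IH) $\mathrm{DOM}(\Gamma, c(\Delta)=a:A) \setminus \mathrm{DEFCONS}(\Gamma, c(\Delta)=a:A)$
    $= \mathrm{DOM}(\Gamma) \setminus \mathrm{DEFCONS}(\Gamma)$.

⊠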

With the above we can show: 
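Lemma 6.36 If $\Gamma$ is sound and $\mathrm{DOM}(d) \subseteq \mathrm{DOM}(\Gamma)$, then
    $\Gamma \vdash d \twoheadrightarrow_\delta |d|_\Gamma$.


PROOF: Induction on the total number of symbols occurring in $\Gamma$ and $d$.
We treat the two most important cases:


• $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$ and $d \equiv c(b_1,\ldots,b_n)$.
Notice that $\Gamma \vdash d \to_\delta a[x_i:=b_i]_{i=1}^{n}$.
By induction, $\Gamma_1, \Delta \vdash a \twoheadrightarrow_\delta |a|_{\Gamma_1,\Delta}$, so $\Gamma \vdash a \twoheadrightarrow_\delta |a|_{\Gamma_1,\Delta}$ (Lemma
6.24), so by Lemma 6.29,

    $\Gamma[x_i:=b_i]_{i=1}^{n} \vdash a[x_i:=b_i]_{i=1}^{n} \twoheadrightarrow_\delta |a|_{\Gamma_1,\Delta}[x_i:=b_i]_{i=1}^{n}$.

As the $x_i$ are bound in $c(\Delta)=a:A$, they do not occur free in $\Gamma$, so
$\Gamma[x_i:=b_i]_{i=1}^{n} \equiv \Gamma$. Therefore

    $\Gamma \vdash a[x_i:=b_i]_{i=1}^{n} \twoheadrightarrow_\delta |a|_{\Gamma_1,\Delta}[x_i:=b_i]_{i=1}^{n}$
    $\twoheadrightarrow_\delta |a|_{\Gamma_1,\Delta}[x_i:=|b_i|_\Gamma]_{i=1}^{n}$
    $\equiv |c(b_1,\ldots,b_n)|_\Gamma$;

• $d \equiv c(\Delta)=a:A \ \mathrm{IN}\ b$. By the induction hypothesis,

    $\Gamma, c(\Delta)=a:A \vdash b \twoheadrightarrow_\delta |b|_{\Gamma,c(\Delta)=a:A}$,

hence

    $\Gamma \vdash c(\Delta)=a:A \ \mathrm{IN}\ b \twoheadrightarrow_\delta c(\Delta)=a:A \ \mathrm{IN}\ |b|_{\Gamma,c(\Delta)=a:A}$.

Now $\mathrm{DOM}(d) \subseteq \mathrm{DOM}(\Gamma)$, so $\mathrm{DOM}(b) \subseteq \mathrm{DOM}(\Gamma, c(\Delta)=a:A)$, so by
Lemma 6.35,

    $\mathrm{DOM}(|b|_{\Gamma,c(\Delta)=a:A}) \subseteq \mathrm{DOM}(\Gamma, c(\Delta)=a:A) \setminus \mathrm{DEFCONS}(\Gamma, c(\Delta)=a:A)$,

so $c \notin \mathrm{CONS}(|b|_{\Gamma,c(\Delta)=a:A})$, and

    $\Gamma \vdash d \twoheadrightarrow_\delta c(\Delta)=a:A \ \mathrm{IN}\ |b|_{\Gamma,c(\Delta)=a:A} \to_\delta |b|_{\Gamma,c(\Delta)=a:A}$.

⊠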


Corollary 6.37 In any CD-PTS, the relation $\to_\delta$ is weakly normalising,
i.e. each legal term has a $\delta$-normal form.


PROOF: By Lemma 6.34 and Lemma 6.35, $|b|_\Gamma$ is in $\delta$-normal form; by
Lemma 6.36, $|b|_\Gamma$ is a $\delta$-normal form of $b$. ⊠


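The mapping $|\_|_{\_}$ also helps us to show that $\to_{\beta\delta}$ is confluent (Theorem
6.42). For the proof we use some lemmas:


Lemma 6.38 Assume $(\Gamma_1, \Gamma_3)$ is sound and $\mathrm{DOM}(b) \subseteq \mathrm{DOM}(\Gamma_1, \Gamma_3)$.
Then $|b|_{\Gamma_1,\Gamma_2,\Gamma_3} \equiv |b|_{\Gamma_1,\Gamma_3}$.


PROOF: Induction on the definition of $|b|_{\Gamma_1,\Gamma_3}$. We consider only a few
non-trivial cases:


• $b \equiv c(b_1,\ldots,b_n)$ and $\Gamma_3 \equiv (\Gamma_{31}, c(\Delta)=a:A, \Gamma_{32})$.

    $|c(b_1,\ldots,b_n)|_{\Gamma_1,\Gamma_3} \equiv |a|_{\Gamma_1,\Gamma_{31},\Delta}[x_i:=|b_i|_{\Gamma_1,\Gamma_3}]_{i=1}^{n}$
    $\equiv$ (IH, 6.30) $|a|_{\Gamma_1,\Gamma_2,\Gamma_{31},\Delta}[x_i:=|b_i|_{\Gamma_1,\Gamma_2,\Gamma_3}]_{i=1}^{n}$
    $\equiv |c(b_1,\ldots,b_n)|_{\Gamma_1,\Gamma_2,\Gamma_3}$;

• $b \equiv c(\Delta)=a:A \ \mathrm{IN}\ b'$. Notice that

    $|c(\Delta)=a:A \ \mathrm{IN}\ b'|_{\Gamma_1,\Gamma_3} \equiv |b'|_{\Gamma_1,\Gamma_3,c(\Delta)=a:A}$
    $\equiv$ (IH) $|b'|_{\Gamma_1,\Gamma_2,\Gamma_3,c(\Delta)=a:A}$
    $\equiv |c(\Delta)=a:A \ \mathrm{IN}\ b'|_{\Gamma_1,\Gamma_2,\Gamma_3}$.

⊠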


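Lemma 6.39 Assume $(\Gamma_1, \Gamma_2)$ is sound, and $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma_1, \Gamma_2)$,
$\mathrm{DOM}(b) \subseteq \mathrm{DOM}(\Gamma_1)$ and $x \notin \mathrm{DOM}(\Gamma_1)$. Then

    $|a|_{\Gamma_1,\Gamma_2}[x:=|b|_{\Gamma_1}] \equiv |a[x:=b]|_{\Gamma_1,\Gamma_2[x:=b]}$.


PROOF: Induction on the definition of $|a|_{\Gamma_1,\Gamma_2}$. We treat only a few
non-trivial cases:


• $a \equiv c(b_1,\ldots,b_n)$ and $\Gamma_2 \equiv \Gamma_{21}, c(\Delta)=e':C, \Gamma_{22}$.

    $|a|_{\Gamma_1,\Gamma_2}[x:=|b|_{\Gamma_1}]$
    $\equiv |e'|_{\Gamma_1,\Gamma_{21},\Delta}[x_i:=|b_i|_{\Gamma_1,\Gamma_2}]_{i=1}^{n}[x:=|b|_{\Gamma_1}]$
    $\equiv$ (6.27) $|e'|_{\Gamma_1,\Gamma_{21},\Delta}[x:=|b|_{\Gamma_1}][x_i:=|b_i|_{\Gamma_1,\Gamma_2}[x:=|b|_{\Gamma_1}]]_{i=1}^{n}$
    $\equiv$ (IH) $|e'[x:=b]|_{\Gamma_1,\Gamma_{21}[x:=b],\Delta[x:=b]}[x_i:=|b_i[x:=b]|_{\Gamma_1,\Gamma_2[x:=b]}]_{i=1}^{n}$
    $\equiv |c(b_1,\ldots,b_n)[x:=b]|_{\Gamma_1,\Gamma_2[x:=b]}$;

• $a \equiv c(\Delta)=e':C \ \mathrm{IN}\ d$.

    $|a|_{\Gamma_1,\Gamma_2}[x:=|b|_{\Gamma_1}] \equiv |d|_{\Gamma_1,\Gamma_2,c(\Delta)=e':C}[x:=|b|_{\Gamma_1}]$
    $\equiv$ (IH) $|d[x:=b]|_{\Gamma_1,\Gamma_2[x:=b],c(\Delta[x:=b])=e'[x:=b]:C[x:=b]}$
    $\equiv |(c(\Delta)=e':C \ \mathrm{IN}\ d)[x:=b]|_{\Gamma_1,\Gamma_2[x:=b]}$.

⊠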


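Lemma 6.40 If $\Gamma \vdash d \to_\delta d'$, $\Gamma$ is sound, and $\mathrm{DOM}(d) \subseteq \mathrm{DOM}(\Gamma)$, then
$|d|_\Gamma \equiv |d'|_\Gamma$ and $\mathrm{DOM}(d') \subseteq \mathrm{DOM}(\Gamma)$.

PROOF: We prove the following two statements simultaneously by induction
on the definition of $|d|_\Gamma$:

• If $\Gamma \vdash d \to_\delta d'$ then $|d|_\Gamma \equiv |d'|_\Gamma$;
• If $\Gamma \to_\delta \Gamma'$ then $|d|_\Gamma \equiv |d|_{\Gamma'}$.

We prove the two non-trivial cases:


• $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$, $d \equiv c(b_1,\ldots,b_n)$, and $d' \equiv a[x_i:=b_i]_{i=1}^{n}$.

    $|d|_\Gamma \equiv |a|_{\Gamma_1,\Delta}[x_i:=|b_i|_\Gamma]_{i=1}^{n}$
    $\equiv |a|_{\Gamma_1}[x_i:=|b_i|_\Gamma]_{i=1}^{n}$
    $\equiv$ (6.38) $|a|_\Gamma[x_i:=|b_i|_\Gamma]_{i=1}^{n}$
    $\equiv$ (6.39) $|a[x_i:=b_i]_{i=1}^{n}|_\Gamma$
    $\equiv |d'|_\Gamma$.

The several cases that have to be distinguished for $\Gamma \to_\delta \Gamma'$ are all
easy to prove;


• $d \equiv c(\Delta)=a:A \ \mathrm{IN}\ b$. Use induction on $\Gamma \vdash d \to_\delta d'$. We treat two
cases:

  - $d' \equiv b$ and $c \notin \mathrm{CONS}(b)$. Then $|d|_\Gamma \equiv |b|_{\Gamma,c(\Delta)=a:A} \equiv$ (6.38) $|b|_\Gamma \equiv |d'|_\Gamma$;
  - $d' \equiv c(\Delta)=a:A \ \mathrm{IN}\ b'$ and $\Gamma, c(\Delta)=a:A \vdash b \to_\delta b'$. Then
    $|d|_\Gamma \equiv |b|_{\Gamma,c(\Delta)=a:A} \equiv$ (IH) $|b'|_{\Gamma,c(\Delta)=a:A} \equiv |d'|_\Gamma$.

If $\Gamma \to_\delta \Gamma'$ then $|d|_\Gamma \equiv |b|_{\Gamma,c(\Delta)=a:A} \equiv$ (IH) $|b|_{\Gamma',c(\Delta)=a:A} \equiv |d|_{\Gamma'}$.
⊠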


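Lemma 6.41 If $\Gamma$ is sound, $\mathrm{DOM}(d) \subseteq \mathrm{DOM}(\Gamma)$ and $d \to_\beta d'$, then
$|d|_\Gamma \twoheadrightarrow_\beta |d'|_\Gamma$. ⊠


The proof is similar to the proof of Lemma 6.40.

Theorem 6.42 (Confluence for $\to_{\beta\delta}$) If $\Gamma$ is sound, $\Gamma \vdash a \twoheadrightarrow_{\beta\delta} b_1$ and
$\Gamma \vdash a \twoheadrightarrow_{\beta\delta} b_2$ then there exists a term $d$ such that $\Gamma \vdash b_1 \twoheadrightarrow_{\beta\delta} d$ and
$\Gamma \vdash b_2 \twoheadrightarrow_{\beta\delta} d$.


PROOF: The proof is illustrated by the following diagram.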


6b3 Strong normalisation for $\to_\delta$


In [40], van Daalen presents a proof (originally due to de Bruijn) of strong
normalisation for a definition system that is at the basis of AUTOMATH.
De Vrijer uses a similar technique to prove the finite developments theorem
[119]. A technique similar to that of de Vrijer is also used in [114] to
prove strong normalisation for $\delta$-reduction in D-PTSs. We extend these
techniques to prove strong normalisation for $\delta$-reduction in CD-PTSs.


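First we define the multiplicity $M_x(\Gamma, a)$ of a variable $x$ in a term $a$,
depending on a context $\Gamma$.


Definition 6.43 For $x \in V$, $\Gamma \in \mathcal{C}_P$ and $a \in \mathcal{T}_P$ we define a natural number
$M_x(\Gamma, a)$ by induction on the total number of symbols in $\Gamma$ and $a$.

    $M_x(\Gamma, y) = 1$ if $x \equiv y$, and $0$ if $x \not\equiv y$;
    $M_x(\Gamma, s) = 0$ if $s \in S$;
    $M_x(\Gamma, c(b_1,\ldots,b_n)) = M_x((\Gamma_1,\Delta), a) + \sum_{i=1}^{n} M_x(\Gamma, b_i) \cdot \max(1, M_{x_i}((\Gamma_1,\Delta), a))$
        if $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$;
    $M_x(\Gamma, c(b_1,\ldots,b_n)) = \sum_{i=1}^{n} M_x(\Gamma, b_i)$ otherwise;
    $M_x(\Gamma, c(\Delta)=a:A \ \mathrm{IN}\ b) = M_x((\Gamma,\Delta), a) + M_x((\Gamma,\Delta), A) + \sum_{i=1}^{n} M_x((\Gamma,\Delta_i), B_i) + M_x((\Gamma, c(\Delta)=a:A), b)$;
    $M_x(\Gamma, ab) = M_x(\Gamma, a) + M_x(\Gamma, b)$;
    $M_x(\Gamma, \Pi y:A.a) = M_x((\Gamma, y:A), a) + M_x(\Gamma, A)$;
    $M_x(\Gamma, \lambda y:A.a) = M_x((\Gamma, y:A), a) + M_x(\Gamma, A)$.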


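Following the line of [119] one can prove the following lemma (using
induction on the definitions of $M_{\_}(\_, \_)$):


Lemma 6.44


1. If $\Gamma$ is sound, $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma)$ and $x \notin \mathrm{FV}(a) \cup \mathrm{FV}(\Gamma)$, then
$M_x(\Gamma, a) = 0$;


2. If $(\Gamma_1, \Gamma_3)$ is sound and $\mathrm{DOM}(a) \subseteq \mathrm{DOM}((\Gamma_1, \Gamma_3))$, then

    $M_x((\Gamma_1, \Gamma_3), a) = M_x((\Gamma_1, \Gamma_2, \Gamma_3), a)$;


3. If $(\Gamma_1, \Gamma_2)$ is sound, $\mathrm{DOM}(b) \subseteq \mathrm{DOM}((\Gamma_1, \Gamma_2))$, $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma_1)$,
$x \not\equiv y$ and $x \notin \mathrm{FV}(\Gamma_1)$, then

    $M_y((\Gamma_1, \Gamma_2[x:=a]), b[x:=a]) = M_y((\Gamma_1, \Gamma_2), b) + M_y(\Gamma_1, a) \cdot M_x((\Gamma_1, \Gamma_2), b)$.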


The following lemma requires a somewhat more complicated proof than
in [119], as contexts are involved in our situation.

Lemma 6.45 Let $\Gamma$ be sound, $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma)$. If $\Gamma \vdash a \to_\delta b$, then
$M_x(\Gamma, a) \geq M_x(\Gamma, b)$.


PROOF: We simultaneously prove, using induction on the total number of
symbols in $\Gamma$ and $a$, the following two statements:


1. If $\Gamma \vdash a \to_\delta b$, then $M_x(\Gamma, a) \geq M_x(\Gamma, b)$;
2. If $\Gamma \to_\delta \Gamma'$, then $M_x(\Gamma, a) \geq M_x(\Gamma', a)$.


The proof is straightforward, using the lemma above. ⊠


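Next we define, for $\Gamma \in \mathcal{C}_P$ and $a \in \mathcal{T}_P$, a natural number $L_\Gamma(a)$ that
decreases with each $\delta$-reduction step. It is similar to the mappings defined
in [119] (used to prove the finite developments theorem), in [40] and in [114]
(used to prove strong normalisation of $\delta$-reduction). This function $L_{\_}(\_)$
computes an upper bound for the length of the longest $\delta$-reduction path
from a term to its $\delta$-normal form.


Definition 6.46 For $\Gamma \in \mathcal{C}_P$ and $a \in \mathcal{T}_P$ we define $L_\Gamma(a)$ by induction on
the total number of symbols in $\Gamma$ and $a$:

    $L_\Gamma(x) = 0$ if $x \in V$;
    $L_\Gamma(s) = 0$ if $s \in S$;
    $L_\Gamma(c(b_1,\ldots,b_n)) = L_{(\Gamma_1,\Delta)}(a) + \sum_{i=1}^{n} L_\Gamma(b_i) \cdot \max(1, M_{x_i}((\Gamma_1,\Delta), a)) + 1$
        if $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$;
    $L_\Gamma(c(b_1,\ldots,b_n)) = \sum_{i=1}^{n} L_\Gamma(b_i)$ otherwise;
    $L_\Gamma(c(\Delta)=a:A \ \mathrm{IN}\ b) = L_{(\Gamma,\Delta)}(a) + L_{(\Gamma,\Delta)}(A) + \sum_{i=1}^{n} L_{(\Gamma,\Delta_i)}(B_i) + L_{(\Gamma,c(\Delta)=a:A)}(b) + 1$;
    $L_\Gamma(ab) = L_\Gamma(a) + L_\Gamma(b)$;
    $L_\Gamma(\Pi x:A.a) = L_{(\Gamma,x:A)}(a) + L_\Gamma(A)$;
    $L_\Gamma(\lambda x:A.a) = L_{(\Gamma,x:A)}(a) + L_\Gamma(A)$.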


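Properties similar to those in Lemma 6.44 and Lemma 6.45 hold for $L_{\_}(\_)$:

Lemma 6.47
1. If $(\Gamma_1, \Gamma_3)$ is sound, $\mathrm{DOM}(a) \subseteq \mathrm{DOM}((\Gamma_1, \Gamma_3))$, then

    $L_{(\Gamma_1,\Gamma_3)}(a) = L_{(\Gamma_1,\Gamma_2,\Gamma_3)}(a)$;


2. If $(\Gamma_1, \Gamma_2)$ is sound, $\mathrm{DOM}(b) \subseteq \mathrm{DOM}((\Gamma_1, \Gamma_2))$, $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma_1)$,
and $x \notin \mathrm{FV}(\Gamma_1)$, then

    $L_{(\Gamma_1,\Gamma_2[x:=a])}(b[x:=a]) = L_{(\Gamma_1,\Gamma_2)}(b) + L_{\Gamma_1}(a) \cdot M_x((\Gamma_1, \Gamma_2), b)$.

The lemma above is used to prove the crucial property of $L_{\_}(\_)$:


Lemma 6.48 If $\Gamma$ is sound, $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma)$ and $\Gamma \vdash a \to_\delta b$, then
$L_\Gamma(a) > L_\Gamma(b)$.


PROOF: Similar to the proof of Lemma 6.45. ⊠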
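
The measures $M$ and $L$ can be computed on a small example to see the decrease claimed in Lemma 6.48. The sketch below is an illustration on a type-free fragment only: local definitions, sorts and binders are left out, variable declarations are ignored, a context is a list of `(name, parameters, body)` triples, and the function names (`mult`, `measure`, `delta_reducts`, `decreases_everywhere`) are hypothetical.

```python
def subst_all(term, sigma):
    """Simultaneously replace the variables listed in the dict sigma."""
    if isinstance(term, str):
        return sigma.get(term, term)
    if term[0] == "app":
        return ("app", subst_all(term[1], sigma), subst_all(term[2], sigma))
    return ("con", term[1], tuple(subst_all(t, sigma) for t in term[2]))

def find_def(ctx, name):
    """Return (prefix, params, body) if name is defined in ctx, else None."""
    for i, (c, params, body) in enumerate(ctx):
        if c == name:
            return ctx[:i], params, body
    return None

def mult(x, ctx, term):
    """M_x(ctx, term) on the type-free fragment."""
    if isinstance(term, str):
        return 1 if term == x else 0
    if term[0] == "app":
        return mult(x, ctx, term[1]) + mult(x, ctx, term[2])
    _, name, args = term
    found = find_def(ctx, name)
    if found is None:
        return sum(mult(x, ctx, b) for b in args)
    prefix, params, body = found
    return mult(x, prefix, body) + sum(
        mult(x, ctx, b) * max(1, mult(p, prefix, body))
        for p, b in zip(params, args))

def measure(ctx, term):
    """L_ctx(term) on the type-free fragment."""
    if isinstance(term, str):
        return 0
    if term[0] == "app":
        return measure(ctx, term[1]) + measure(ctx, term[2])
    _, name, args = term
    found = find_def(ctx, name)
    if found is None:
        return sum(measure(ctx, b) for b in args)
    prefix, params, body = found
    return measure(prefix, body) + sum(
        measure(ctx, b) * max(1, mult(p, prefix, body))
        for p, b in zip(params, args)) + 1

def delta_reducts(ctx, term):
    """All one-step delta-reducts (only rule delta 1 exists in this fragment)."""
    if isinstance(term, str):
        return []
    if term[0] == "app":
        _, f, a = term
        return ([("app", g, a) for g in delta_reducts(ctx, f)]
                + [("app", f, c) for c in delta_reducts(ctx, a)])
    _, name, args = term
    out = []
    found = find_def(ctx, name)
    if found is not None:
        _, params, body = found
        out.append(subst_all(body, dict(zip(params, args))))
    for i, b in enumerate(args):
        for b2 in delta_reducts(ctx, b):
            out.append(("con", name, args[:i] + (b2,) + args[i + 1:]))
    return out

def decreases_everywhere(ctx, term):
    """Check that the measure drops strictly along every delta-path from term."""
    return all(measure(ctx, term) > measure(ctx, r)
               and decreases_everywhere(ctx, r)
               for r in delta_reducts(ctx, term))

# d(x) = x x and e(y) = d(d(y)); the context is sound in the sense of the text.
gamma = [("d", ("x",), ("app", "x", "x")),
         ("e", ("y",), ("con", "d", (("con", "d", ("y",)),)))]
start = ("con", "e", ("z",))
assert measure(gamma, start) == 4
assert decreases_everywhere(gamma, start)
```

For this example the value 4 is attained: unfolding `e`, then the outer `d`, then both remaining copies of `d(z)` gives a path of four steps.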


Theorem 6.49 (Strong Normalisation for $\to_\delta$) The reduction $\to_\delta$ (when
restricted to sound contexts $\Gamma$ and terms $a$ with $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma)$) is
strongly normalising, i.e. there are no infinite $\delta$-reduction paths.


PROOF: This follows from Lemma 6.48. ⊠


Without the restriction to sound contexts $\Gamma$ and terms $a$ with $\mathrm{DOM}(a) \subseteq
\mathrm{DOM}(\Gamma)$, we do not even have weak normalisation: take

    $\Gamma \equiv (c()=d():A, d()=c():A)$.

The term $c()$ does not have a $\delta$-normal form.


6c Properties of legal terms 


The properties in this section are proved for all terms that are legal in a
pure type system with parameters, i.e. for terms $a$ for which there are $A$,
$\Gamma$ such that $\Gamma \vdash^{CD} a : A$ or $\Gamma \vdash^{CD} A : a$. The main property we prove is
that strong normalisation is preserved by certain extensions.

Many of the standard properties of PTSs in [5], [54] hold for CD-PTSs
as well. In the same way as in [5], [54] we can prove the Substitution
Lemma, Correctness of Types, Subject Reduction (for $\beta\delta$-reduction) and
Uniqueness of Types (for singly sorted CD-PTSs):

Theorem 6.50 Let $\mathfrak{S}$ be a specification. The type system $\lambda^{CD}(\mathfrak{S})$ has the
following properties:

• Substitution Lemma;

• Correctness of Types;

• Subject Reduction (for $\to_{\beta\delta}$).

Moreover, if $\mathfrak{S}$ is singly sorted then $\lambda^{CD}(\mathfrak{S})$ has Uniqueness of Types. ⊠


The Generation Lemma is extended with two extra cases: 


Lemma 6.51 (Generation Lemma, extension)


1. If $\Gamma \vdash^{CD} c(b_1,\ldots,b_n) : D$ then there exist $s$, $\Delta$ and $A$ such that
$\Gamma \vdash D =_{\beta\delta} A[x_i:=b_i]_{i=1}^{n}$, and $\Gamma \vdash^{CD} b_i : B_i[x_j:=b_j]_{j=1}^{i-1}$. Besides we
have one of these two possibilities:
(a) Either $\Gamma \equiv (\Gamma_1, c(\Delta):A, \Gamma_2)$ and $\Gamma_1, \Delta \vdash^{CD} A : s$;
(b) Or $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$ and $\Gamma_1, \Delta \vdash^{CD} a : A$;

2. If $\Gamma \vdash^{CD} c(\Delta)=a:A \ \mathrm{IN}\ b : D$ then we have two possibilities:
(a) Either $\Gamma, c(\Delta)=a:A \vdash^{CD} b : B$, $\Gamma \vdash^{CD} (c(\Delta)=a:A \ \mathrm{IN}\ B) : s$ and
$\Gamma \vdash D =_{\beta\delta} c(\Delta)=a:A \ \mathrm{IN}\ B$;
(b) Or $\Gamma, c(\Delta)=a:A \vdash^{CD} b : s$ and $\Gamma \vdash D =_{\beta\delta} s$.
⊠

In case 1(b) we do not necessarily have $\Gamma_1, \Delta \vdash^{CD} A : s$. For instance, in
the CD-PTSs of the Barendregt Cube one can abbreviate terms $a$ of type
$\Box$, whilst $\Box$ is not typable in these systems.


Also Correctness of Contexts has some extra cases compared to usual
PTSs. Recall that $\Gamma$ is legal if there are $b$, $B$ such that $\Gamma \vdash^{CD} b : B$.


Lemma 6.52 (Correctness of Contexts)


1. If $\Gamma, x:A, \Gamma'$ is legal then there exists a sort $s$ such that $\Gamma \vdash^{CD} A : s$;
2. If $\Gamma, c(\Delta):A, \Gamma'$ is legal then $\Gamma, \Delta \vdash^{CD} A : s$;
3. If $\Gamma, c(\Delta)=a:A, \Gamma'$ is legal then $\Gamma, \Delta \vdash^{CD} a : A$.
⊠


Again, in case 3 we do not necessarily have $\Gamma, \Delta \vdash^{CD} A : s$.

Now we prove that $\lambda^{CD}(\mathfrak{S})$ is $\beta\delta$-strongly normalising if a slightly larger
PTS $\lambda(\mathfrak{S}')$ is $\beta$-strongly normalising. The proof follows the ideas used in
[114] to prove that a PTS extended with definitions is $\beta\delta$-strongly
normalising.

For legal terms $a \in \mathcal{T}_P$ in a context $\Gamma$, we define a lambda term $\|a\|_\Gamma$
without definitions and without parameters. If $a$ is typable in a CD-PTS
$\lambda^{CD}(\mathfrak{S})$, then $\|a\|_\Gamma$ will be typable in a PTS $\lambda(\mathfrak{S}')$, where $\mathfrak{S}'$ is a
so-called completion (see Definition 6.60) of the specification $\mathfrak{S}$. Moreover,
we take care that if $a \to_\beta a'$, then $\|a\|_\Gamma \twoheadrightarrow^{+}_\beta \|a'\|_\Gamma$ (that is: $\|a\|_\Gamma \twoheadrightarrow_\beta \|a'\|_\Gamma$
and $\|a\|_\Gamma \not\equiv \|a'\|_\Gamma$). Together with strong normalisation of $\delta$-reduction
(Theorem 6.49), this guarantees that $\lambda^{CD}(\mathfrak{S})$ is $\beta\delta$-strongly normalising
whenever $\lambda(\mathfrak{S}')$ is $\beta$-strongly normalising.

We suppose that $V \cup C$, the set of variables and constants that are used
to define $\mathcal{T}_P$, is included in the set of variables that is used to define $\mathcal{T}$, the
set of terms used for the PTS $\lambda(\mathfrak{S}')$.

$\Delta$ still denotes a list of variables $x_1:B_1,\ldots,x_n:B_n$ and $\Delta_i$ is an
abbreviation for $x_1:B_1,\ldots,x_{i-1}:B_{i-1}$. $\lambda\Delta.a$ denotes $\lambda_{i=1}^{n} x_i:B_i.a$ and $\Pi\Delta.A$
denotes $\Pi_{i=1}^{n} x_i:B_i.A$.


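Definition 6.53 For $a \in \mathcal{T}_P$ and $\Gamma \in \mathcal{C}_P$ we define $\|a\|_\Gamma$ as follows:

    $\|s\|_\Gamma \equiv s$ if $s \in S$;
    $\|x\|_\Gamma \equiv x$ if $x \in V$;
    $\|c(b_1,\ldots,b_n)\|_\Gamma \equiv \|\lambda\Delta.a\|_{\Gamma_1} \|b_1\|_\Gamma \cdots \|b_n\|_\Gamma$ if $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$;
    $\|c(b_1,\ldots,b_n)\|_\Gamma \equiv c\, \|b_1\|_\Gamma \cdots \|b_n\|_\Gamma$ otherwise;
    $\|ab\|_\Gamma \equiv \|a\|_\Gamma \|b\|_\Gamma$;
    $\|\lambda x:A.b\|_\Gamma \equiv \lambda x:\|A\|_\Gamma.\|b\|_{\Gamma,x:A}$;
    $\|\Pi x:A.b\|_\Gamma \equiv \Pi x:\|A\|_\Gamma.\|b\|_{\Gamma,x:A}$;
    $\|c(\Delta)=a:A \ \mathrm{IN}\ b\|_\Gamma \equiv (\lambda c:\|\Pi\Delta.A\|_\Gamma.\|b\|_{\Gamma,c(\Delta)=a:A})\, \|\lambda\Delta.a\|_\Gamma$.

The mapping $\|\_\|_{\_}$ is slightly different from the mapping $|\_|_{\_}$. This is
because we want $\|\_\|_{\_}$ to maintain $\beta$-reductions. In a term $c(\Delta)=a:A \ \mathrm{IN}\ b$,
there may be $\beta$-redexes in $\Delta$, $a$ or $A$. These redexes may be lost in
$|c(\Delta)=a:A \ \mathrm{IN}\ b|_\Gamma \equiv |b|_{\Gamma,c(\Delta)=a:A}$. Due to the extra $\lambda$-abstraction in the
definition of $\|c(\Delta)=a:A \ \mathrm{IN}\ b\|_\Gamma$, the possible $\beta$-redexes in $\Delta$, $a$ and $A$ are
maintained.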
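
The shape of this translation can be sketched in code. The following is only an illustration under simplifying assumptions: type annotations on $\lambda$ are dropped, sorts and $\Pi$ are omitted, a context is a list of `(name, parameters, body)` definitions, a local definition is written `("let", c, params, a, b)`, and `translate` and `lam_delta` are hypothetical names.

```python
def lam_delta(ctx, params, body):
    """Build the translation of (lambda Delta. body), without type annotations."""
    result = translate(ctx, body)
    for p in reversed(params):
        result = ("lam", p, result)
    return result

def translate(ctx, term):
    """Translate a parametric term into a plain lambda term."""
    if isinstance(term, str):
        return term
    tag = term[0]
    if tag == "app":
        return ("app", translate(ctx, term[1]), translate(ctx, term[2]))
    if tag == "let":
        # c(Delta)=a IN b  becomes  (lambda c. ||b||) ||lambda Delta. a||
        _, c, params, a, b = term
        inner = translate(ctx + [(c, params, a)], b)
        return ("app", ("lam", c, inner), lam_delta(ctx, params, a))
    _, name, args = term
    head = name  # an undefined constant becomes a variable of the target calculus
    for i, (c, params, body) in enumerate(ctx):
        if c == name:
            head = lam_delta(ctx[:i], params, body)
    for b in args:
        head = ("app", head, translate(ctx, b))
    return head

# Global definition d(x) = x x: the term d(z) becomes (lambda x. x x) z.
gamma = [("d", ("x",), ("app", "x", "x"))]
assert translate(gamma, ("con", "d", ("z",))) == \
    ("app", ("lam", "x", ("app", "x", "x")), "z")

# Local definition c(x)=x IN c(z): the definiens survives as an argument.
local = ("let", "c", ("x",), "x", ("con", "c", ("z",)))
assert translate([], local) == \
    ("app", ("lam", "c", ("app", ("lam", "x", "x"), "z")), ("lam", "x", "x"))
```

In the second example the definiens is kept as the argument of the outer abstraction, so any redex inside it is still present after translation, which is the point made in the paragraph above.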

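The mapping $\|\_\|_{\_}$ is extended to contexts.


Definition 6.54 For a context $\Gamma \in \mathcal{C}_P$ we define $\|\Gamma\|$ as follows:

    $\|\langle\rangle\| \equiv \langle\rangle$;
    $\|\Gamma, x:A\| \equiv \|\Gamma\|, x:\|A\|_\Gamma$;
    $\|\Gamma, c(\Delta):A\| \equiv \|\Gamma\|, c:\|\Pi\Delta.A\|_\Gamma$;
    $\|\Gamma, c(\Delta)=a:A\| \equiv \|\Gamma\|, c:\|\Pi\Delta.A\|_\Gamma$.

We have similar properties for $\|\_\|_{\_}$ as for $|\_|_{\_}$:


Lemma 6.55 If $(\Gamma_1, \Gamma_3)$ is sound and $\mathrm{DOM}(a) \subseteq \mathrm{DOM}((\Gamma_1, \Gamma_3))$, then
$\|a\|_{\Gamma_1,\Gamma_2,\Gamma_3} \equiv \|a\|_{\Gamma_1,\Gamma_3}$. ⊠


The proof is similar to the proof of Lemma 6.38.


Lemma 6.56 Assume $(\Gamma_1, \Gamma_2)$ is sound, and $\mathrm{DOM}(a) \subseteq \mathrm{DOM}((\Gamma_1, \Gamma_2))$,
$\mathrm{DOM}(b) \subseteq \mathrm{DOM}(\Gamma_1)$, and $x \notin \mathrm{DOM}(\Gamma_1)$. Then

    $\|a\|_{\Gamma_1,\Gamma_2}[x:=\|b\|_{\Gamma_1}] \equiv \|a[x:=b]\|_{\Gamma_1,\Gamma_2[x:=b]}$.


The proof is similar to the proof of Lemma 6.39.

We now show that $\|\_\|_{\_}$ translates a $\delta$-reduction into zero or more
$\beta$-reductions, and that it translates a $\beta$-reduction into one or more
$\beta$-reductions.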


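Lemma 6.57 Let $\Gamma$ be sound, and assume $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma)$. If $a \to_\delta b$
then $\|a\|_\Gamma \twoheadrightarrow_\beta \|b\|_\Gamma$.


PROOF: Using induction on the definition of $\Gamma \vdash a \to_\delta b$, we simultaneously
prove:

1. If $a \to_\delta b$ then $\|a\|_\Gamma \twoheadrightarrow_\beta \|b\|_\Gamma$;
2. If $\Gamma \to_\delta \Gamma'$ then $\|a\|_\Gamma \twoheadrightarrow_\beta \|a\|_{\Gamma'}$.


We only treat two non-trivial cases.

• $\Gamma_1, c(\Delta)=a:A, \Gamma_2 \vdash c(b_1,\ldots,b_n) \to_\delta a[x_i:=b_i]_{i=1}^{n}$. Observe:

    $\|c(b_1,\ldots,b_n)\|_\Gamma \equiv \|\lambda\Delta.a\|_{\Gamma_1} \|b_1\|_\Gamma \cdots \|b_n\|_\Gamma$
    $\equiv (\lambda_{i=1}^{n} x_i:\|B_i\|_{\Gamma_1,\Delta_i}.\|a\|_{\Gamma_1,\Delta})\, \|b_1\|_\Gamma \cdots \|b_n\|_\Gamma$
    $\twoheadrightarrow_\beta \|a\|_{\Gamma_1,\Delta}[x_i:=\|b_i\|_\Gamma]_{i=1}^{n}$
    $\equiv$ (6.55) $\|a\|_{\Gamma_1}[x_i:=\|b_i\|_\Gamma]_{i=1}^{n}$
    $\equiv$ (6.55) $\|a\|_\Gamma[x_i:=\|b_i\|_\Gamma]_{i=1}^{n}$
    $\equiv$ (6.56) $\|a[x_i:=b_i]_{i=1}^{n}\|_\Gamma$;


• $\Gamma \vdash c(\Delta)=a:A \ \mathrm{IN}\ b \to_\delta b$ because $c \notin \mathrm{CONS}(b)$. Then $c \notin \mathrm{FV}(\|b\|_\Gamma)$.
Hence

    $\|c(\Delta)=a:A \ \mathrm{IN}\ b\|_\Gamma \equiv (\lambda c:\|\Pi\Delta.A\|_\Gamma.\|b\|_{\Gamma,c(\Delta)=a:A})\, \|\lambda\Delta.a\|_\Gamma$
    $\to_\beta \|b\|_{\Gamma,c(\Delta)=a:A}[c:=\|\lambda\Delta.a\|_\Gamma]$
    $\equiv \|b\|_\Gamma$.

⊠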


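Lemma 6.58 Let $\Gamma$ be sound, and assume $\mathrm{DOM}(a) \subseteq \mathrm{DOM}(\Gamma)$. If $a \to_\beta b$
then $\|a\|_\Gamma \twoheadrightarrow^{+}_\beta \|b\|_\Gamma$.


PROOF: The following two statements are proved simultaneously by induction
on the structure of $a$.

1. If $a \to_\beta b$ then $\|a\|_\Gamma \twoheadrightarrow^{+}_\beta \|b\|_\Gamma$;
2. If $\Gamma \to_\beta \Gamma'$ then $\|a\|_\Gamma \twoheadrightarrow_\beta \|a\|_{\Gamma'}$.

(IH 1) refers to the induction hypothesis on 1, (IH 2) to the induction
hypothesis on 2. We do not treat all cases, and only prove the first statement.

• $c(b_1,\ldots,b_n) \to_\beta c(b_1,\ldots,b_j',\ldots,b_n)$, where $b_j \to_\beta b_j'$,
and $\Gamma \equiv (\Gamma_1, c(\Delta)=a:A, \Gamma_2)$. We have:

    $\|c(b_1,\ldots,b_n)\|_\Gamma \equiv \|\lambda\Delta.a\|_{\Gamma_1} \|b_1\|_\Gamma \cdots \|b_n\|_\Gamma$
    $\twoheadrightarrow^{+}_\beta$ (IH 1) $\|\lambda\Delta.a\|_{\Gamma_1} \|b_1\|_\Gamma \cdots \|b_j'\|_\Gamma \cdots \|b_n\|_\Gamma$
    $\equiv \|c(b_1,\ldots,b_j',\ldots,b_n)\|_\Gamma$;

• $(\lambda x:p.q)r \to_\beta q[x:=r]$. Observe:

    $\|(\lambda x:p.q)r\|_\Gamma \equiv (\lambda x:\|p\|_\Gamma.\|q\|_{\Gamma,x:p})\, \|r\|_\Gamma$
    $\to_\beta \|q\|_{\Gamma,x:p}[x:=\|r\|_\Gamma]$
    $\equiv \|q[x:=r]\|_\Gamma$;

• $c(\Delta)=a:A \ \mathrm{IN}\ b \to_\beta c(\Delta')=a:A \ \mathrm{IN}\ b$, where
  - $\Delta \equiv x_1:B_1,\ldots,x_n:B_n$;
  - $\Delta' \equiv x_1:B_1,\ldots,x_j:B_j',\ldots,x_n:B_n$;
  - $B_j \to_\beta B_j'$.
Write $C_i \equiv B_i$ if $i \neq j$ and $C_j \equiv B_j'$. Furthermore, let $\Delta_i' \equiv
x_1:C_1,\ldots,x_{i-1}:C_{i-1}$. Observe that

    $\|c(\Delta)=a:A \ \mathrm{IN}\ b\|_\Gamma \equiv (\lambda c:\|\Pi\Delta.A\|_\Gamma.\|b\|_{\Gamma,c(\Delta)=a:A})\, \|\lambda\Delta.a\|_\Gamma$
    $\twoheadrightarrow^{+}_\beta$ (IH 1) $(\lambda c:\|\Pi\Delta'.A\|_\Gamma.\|b\|_{\Gamma,c(\Delta)=a:A})\, \|\lambda\Delta'.a\|_\Gamma$
    $\twoheadrightarrow_\beta$ (IH 2) $(\lambda c:\|\Pi\Delta'.A\|_\Gamma.\|b\|_{\Gamma,c(\Delta')=a:A})\, \|\lambda\Delta'.a\|_\Gamma$
    $\equiv \|c(\Delta')=a:A \ \mathrm{IN}\ b\|_\Gamma$.

⊠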


The proof for the cases $c(b_1,\ldots,b_n)$ and $c(\Delta)=a:A \ \mathrm{IN}\ b$ shows that this
lemma does not hold if we use $|\_|_{\_}$ instead of $\|\_\|_{\_}$. The proof for the case
$c(\Delta)=a:A \ \mathrm{IN}\ b$ shows the need to prove that $\|a\|_\Gamma \twoheadrightarrow_\beta \|a\|_{\Gamma'}$ if $\Gamma \to_\beta \Gamma'$.

Definition 6.59 The specification $\mathfrak{S} = (S, A, R)$ is called quasi full if for
all $s_1, s_2 \in S$ there exists $s_3 \in S$ such that $(s_1, s_2, s_3) \in R$.


Definition 6.60 A specification $\mathfrak{S}' = (S', A', R')$ is a completion of $\mathfrak{S} =
(S, A, R)$ if


1. $S \subseteq S'$, $A \subseteq A'$, and $R \subseteq R'$;
2. $\mathfrak{S}'$ is quasi full;
3. for all $s \in S$ there is a sort $s' \in S'$ such that $(s:s') \in A'$.
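
Both definitions are simple set conditions and can be checked mechanically. The sketch below assumes a specification is given as a triple of Python sets, with axioms as pairs and rules as triples; the function names and the extra sort `"tri"` are illustrative choices, not taken from the text.

```python
from itertools import product

def is_quasi_full(sorts, rules):
    """Every pair (s1, s2) of sorts has some s3 with (s1, s2, s3) in rules."""
    return all(any((s1, s2, s3) in rules for s3 in sorts)
               for s1, s2 in product(sorts, repeat=2))

def is_completion(candidate, spec):
    """Check the three conditions of a completion of spec by candidate."""
    (S2, A2, R2), (S, A, R) = candidate, spec
    extends = S <= S2 and A <= A2 and R <= R2
    every_sort_typed = all(any((s, t) in A2 for t in S2) for s in S)
    return extends and is_quasi_full(S2, R2) and every_sort_typed

# The simply typed lambda calculus: sorts * and [], axiom *:[], rule (*,*,*).
simple = ({"*", "[]"}, {("*", "[]")}, {("*", "*", "*")})

# Adding a sort above [] and all rules (s1, s2, s2) gives a completion.
sorts = {"*", "[]", "tri"}
bigger = (sorts,
          {("*", "[]"), ("[]", "tri")},
          {(s1, s2, s2) for s1, s2 in product(sorts, repeat=2)})

# All four Cube rules without a sort above []: quasi full, but [] has no
# axiom ([] : s'), so condition 3 fails.
top_of_cube = ({"*", "[]"}, {("*", "[]")},
               {(s1, s2, s2) for s1, s2 in product({"*", "[]"}, repeat=2)})

assert not is_quasi_full(simple[0], simple[2])
assert is_completion(bigger, simple)
assert not is_completion(top_of_cube, simple)
```

The third example shows that quasi fullness alone is not enough: condition 3 forces a sort above every sort of the original specification.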


Theorem 6.61 Let $\mathfrak{S} = (S, A, R)$ and $\mathfrak{S}' = (S', A', R')$ be such that $\mathfrak{S}'$
is a completion of $\mathfrak{S}$. If $\Gamma \vdash^{CD} a : A$ then

    $\|\Gamma\| \vdash_{\mathfrak{S}'} \|a\|_\Gamma : \|A\|_\Gamma$.


PROOF: Induction on the derivation of $\Gamma \vdash^{CD} a : A$. The rules of normal
PTSs do not cause any problem, and the proofs for the rules for parametric
constants are simplifications of the proofs for the rules for parametric
definitions. We therefore only focus on the rules for parametric definitions.


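• $\delta$-application:

    $\dfrac{\begin{array}{l} \Gamma_1, c(\Delta)=a:A, \Gamma_2 \vdash^{CD} b_i : B_i[x_j:=b_j]_{j=1}^{i-1} \quad (i = 1,\ldots,n) \\ \Gamma_1, c(\Delta)=a:A, \Gamma_2 \vdash^{CD} a : A \quad (\text{if } n = 0) \end{array}}{\Gamma_1, c(\Delta)=a:A, \Gamma_2 \vdash^{CD} c(b_1,\ldots,b_n) : A[x_j:=b_j]_{j=1}^{n}}$

Write $\Gamma \equiv \Gamma_1, c(\Delta)=a:A, \Gamma_2$. If $n = 0$ then we know by induction that

    $\|\Gamma\| \vdash_{\mathfrak{S}'} \|a\|_\Gamma : \|A\|_\Gamma$

and we are done because $\|c(b_1,\ldots,b_n)\|_\Gamma \equiv \|a\|_{\Gamma_1} \equiv$ (6.55) $\|a\|_\Gamma$.

Now assume $n > 0$. As we have a derivation of $\Gamma_1, c(\Delta)=a:A, \Gamma_2 \vdash^{CD}
b_1 : B_1$, we can use Correctness of Contexts to find a (shorter) derivation
of $\Gamma_1, \Delta \vdash^{CD} a : A$. By the induction hypothesis, we have

    $\|\Gamma_1, \Delta\| \vdash_{\mathfrak{S}'} \|a\|_{\Gamma_1,\Delta} : \|A\|_{\Gamma_1,\Delta}$. (1)

Moreover, we can use the induction hypothesis to find

    $\|\Gamma_1\|, c:\|\Pi\Delta.A\|_{\Gamma_1}, \|\Gamma_2\| \vdash_{\mathfrak{S}'} \|b_i\|_\Gamma : \|B_i[x_j:=b_j]_{j=1}^{i-1}\|_\Gamma$. (2)

We can use Correctness of Types for the PTS $\lambda(\mathfrak{S}')$ to find $s \in S'$ with

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} \|\Pi\Delta.A\|_{\Gamma_1} : s$. (3)

Using rule ($\lambda$), (1) and (3) result in $\|\Gamma_1\| \vdash_{\mathfrak{S}'} \|\lambda\Delta.a\|_{\Gamma_1} : \|\Pi\Delta.A\|_{\Gamma_1}$.
By definition of $\|\Pi\Delta.A\|_{\Gamma_1}$, this means

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} \|\lambda\Delta.a\|_{\Gamma_1} : \Pi_{i=1}^{n} x_i:\|B_i\|_{\Gamma_1,\Delta_i}.\|A\|_{\Gamma_1,\Delta}$. (4)

By Lemma 6.56, $\|B_i[x_j:=b_j]_{j=1}^{i-1}\|_\Gamma \equiv \|B_i\|_{\Gamma_1,\Delta_i}[x_j:=\|b_j\|_\Gamma]_{j=1}^{i-1}$. Using
(2) and the application rule, we can derive from (4) that:

    $\|\Gamma\| \vdash_{\mathfrak{S}'} (\|\lambda\Delta.a\|_{\Gamma_1} \|b_1\|_\Gamma \cdots \|b_n\|_\Gamma) : (\|A\|_{\Gamma_1,\Delta}[x_i:=\|b_i\|_\Gamma]_{i=1}^{n})$.

We are done because

    $\|c(b_1,\ldots,b_n)\|_\Gamma \equiv \|\lambda\Delta.a\|_{\Gamma_1} \|b_1\|_\Gamma \cdots \|b_n\|_\Gamma$

and

    $\|A\|_{\Gamma_1,\Delta}[x_i:=\|b_i\|_\Gamma]_{i=1}^{n} \equiv \|A\|_{\Gamma,\Delta}[x_i:=\|b_i\|_\Gamma]_{i=1}^{n} \equiv \|A[x_i:=b_i]_{i=1}^{n}\|_\Gamma$;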


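• $\delta$-weakening:

    $\dfrac{\Gamma \vdash^{CD} b : B \qquad \Gamma, \Delta \vdash^{CD} a : A}{\Gamma, c(\Delta)=a:A \vdash^{CD} b : B}$

By induction, $\|\Gamma, \Delta\| \vdash_{\mathfrak{S}'} \|a\|_{\Gamma,\Delta} : \|A\|_{\Gamma,\Delta}$, so

    $\|\Gamma\|, x_1:\|B_1\|_{\Gamma,\Delta_1}, \ldots, x_n:\|B_n\|_{\Gamma,\Delta_n} \vdash_{\mathfrak{S}'} \|a\|_{\Gamma,\Delta} : \|A\|_{\Gamma,\Delta}$. (5)

By Correctness of Contexts for $\lambda(\mathfrak{S}')$, there are $s_1,\ldots,s_n \in S'$ such
that

    $\|\Gamma\|, x_1:\|B_1\|_{\Gamma,\Delta_1}, \ldots, x_{i-1}:\|B_{i-1}\|_{\Gamma,\Delta_{i-1}} \vdash_{\mathfrak{S}'} \|B_i\|_{\Gamma,\Delta_i} : s_i$. (6)

By Correctness of Types for $\lambda^{CD}(\mathfrak{S})$, there are two possibilities:

  - There is $s \in S$ such that $A \equiv s$. As $\mathfrak{S}'$ is a completion of $\mathfrak{S}$, there
    is $s' \in S'$ such that $\|\Gamma, \Delta\| \vdash_{\mathfrak{S}'} s : s'$.
  - There is $s' \in S$ such that $\Gamma, \Delta \vdash^{CD} A : s'$. Then by induction,
    $\|\Gamma, \Delta\| \vdash_{\mathfrak{S}'} \|A\|_{\Gamma,\Delta} : s'$.

In any case: we can determine $s_0 \in S'$ such that

    $\|\Gamma\|, x_1:\|B_1\|_{\Gamma,\Delta_1}, \ldots, x_n:\|B_n\|_{\Gamma,\Delta_n} \vdash_{\mathfrak{S}'} \|A\|_{\Gamma,\Delta} : s_0$. (7)

As $\mathfrak{S}'$ is quasi-full, we can subsequently determine $s_1',\ldots,s_n'$ such that
$(s_i, s_{i-1}', s_i') \in R'$ for $i = 1,\ldots,n$. This allows us to apply $\Pi$-formation
$n$ times, with as premises (6) and (7), and as conclusion:

    $\|\Gamma\| \vdash_{\mathfrak{S}'} \Pi_{i=1}^{n} x_i:\|B_i\|_{\Gamma,\Delta_i}.\|A\|_{\Gamma,\Delta} : s_n'$.

Notice that $\Pi_{i=1}^{n} x_i:\|B_i\|_{\Gamma,\Delta_i}.\|A\|_{\Gamma,\Delta} \equiv \|\Pi\Delta.A\|_\Gamma$. As the induction
hypothesis gives us also $\|\Gamma\| \vdash_{\mathfrak{S}'} \|b\|_\Gamma : \|B\|_\Gamma$, we can use the
weakening rule of $\lambda(\mathfrak{S}')$ to obtain

    $\|\Gamma\|, c:\|\Pi\Delta.A\|_\Gamma \vdash_{\mathfrak{S}'} \|b\|_\Gamma : \|B\|_\Gamma$.

We are done because $\|b\|_\Gamma \equiv \|b\|_{\Gamma,c(\Delta)=a:A}$ and $\|B\|_\Gamma \equiv \|B\|_{\Gamma,c(\Delta)=a:A}$
(Lemma 6.55);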


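• $\delta$-formation:

    $\dfrac{\Gamma_1, c(\Delta)=a:A \vdash^{CD} B : s}{\Gamma_1 \vdash^{CD} c(\Delta)=a:A \ \mathrm{IN}\ B : s}$

Write $\Gamma \equiv \Gamma_1, c(\Delta)=a:A$. By the induction hypothesis, we have
$\|\Gamma\| \vdash_{\mathfrak{S}'} \|B\|_\Gamma : s$, so

    $\|\Gamma_1\|, c:\|\Pi\Delta.A\|_{\Gamma_1} \vdash_{\mathfrak{S}'} \|B\|_\Gamma : s$. (8)

By Correctness of Contexts on (8) there is $s_1 \in S'$ such that

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} \|\Pi\Delta.A\|_{\Gamma_1} : s_1$. (9)

Moreover: as $\mathfrak{S}'$ is a completion of $\mathfrak{S}$, there is $s_2 \in S'$ such that
$(s:s_2) \in A'$. By the Start Lemma,

    $\|\Gamma_1\|, c:\|\Pi\Delta.A\|_{\Gamma_1} \vdash_{\mathfrak{S}'} s : s_2$. (10)

As $\mathfrak{S}'$ is quasi-full, there is $s_3 \in S'$ such that $(s_1, s_2, s_3) \in R'$. Hence
we can apply $\Pi$-formation:

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} \Pi c:\|\Pi\Delta.A\|_{\Gamma_1}.s : s_3$. (11)

We can now apply $\lambda$-formation on (8) and (11):

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} (\lambda c:\|\Pi\Delta.A\|_{\Gamma_1}.\|B\|_\Gamma) : (\Pi c:\|\Pi\Delta.A\|_{\Gamma_1}.s)$. (12)

As we have a derivation of $\Gamma_1, c(\Delta)=a:A \vdash^{CD} B : s$, we can apply
Correctness of Contexts to find a (shorter) derivation of $\Gamma_1, \Delta \vdash^{CD}
a : A$, so by induction:

    $\|\Gamma_1, \Delta\| \vdash_{\mathfrak{S}'} \|a\|_{\Gamma_1,\Delta} : \|A\|_{\Gamma_1,\Delta}$.

Using (9), we can repeatedly apply $\lambda$-abstraction and obtain

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} \|\lambda\Delta.a\|_{\Gamma_1} : \|\Pi\Delta.A\|_{\Gamma_1}$. (13)

Using (12), (13) and the application rule, we find:

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} (\lambda c:\|\Pi\Delta.A\|_{\Gamma_1}.\|B\|_\Gamma)\, \|\lambda\Delta.a\|_{\Gamma_1} : s$;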
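• $\delta$-introduction:

    $\dfrac{\Gamma_1, c(\Delta)=a:A \vdash^{CD} b : B \qquad \Gamma_1 \vdash^{CD} c(\Delta)=a:A \ \mathrm{IN}\ B : s}{\Gamma_1 \vdash^{CD} (c(\Delta)=a:A \ \mathrm{IN}\ b) : (c(\Delta)=a:A \ \mathrm{IN}\ B)}$

In a similar way as in the previous case, we can find derivations of
(13) and

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} (\lambda c:\|\Pi\Delta.A\|_{\Gamma_1}.\|b\|_\Gamma) : (\Pi c:\|\Pi\Delta.A\|_{\Gamma_1}.\|B\|_\Gamma)$. (14)

Using (13), (14) and the application rule, we find

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} (\lambda c:\|\Pi\Delta.A\|_{\Gamma_1}.\|b\|_\Gamma)\, \|\lambda\Delta.a\|_{\Gamma_1} : \|B\|_\Gamma[c:=\|\lambda\Delta.a\|_{\Gamma_1}]$.

By the induction hypothesis,

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} (\lambda c:\|\Pi\Delta.A\|_{\Gamma_1}.\|B\|_\Gamma)\, \|\lambda\Delta.a\|_{\Gamma_1} : s$,

so we can apply the conversion rule to find

    $\|\Gamma_1\| \vdash_{\mathfrak{S}'} \|c(\Delta)=a:A \ \mathrm{IN}\ b\|_{\Gamma_1} : \|c(\Delta)=a:A \ \mathrm{IN}\ B\|_{\Gamma_1}$;

• $\delta$-conversion:

    $\dfrac{\Gamma \vdash^{CD} b : B \qquad \Gamma \vdash^{CD} B' : s \qquad \Gamma \vdash B =_\delta B'}{\Gamma \vdash^{CD} b : B'}$

By induction, $\|\Gamma\| \vdash_{\mathfrak{S}'} \|b\|_\Gamma : \|B\|_\Gamma$ and $\|\Gamma\| \vdash_{\mathfrak{S}'} \|B'\|_\Gamma : s$. By Lemma
6.57, $\|B\|_\Gamma =_\beta \|B'\|_\Gamma$. By Conversion, $\|\Gamma\| \vdash_{\mathfrak{S}'} \|b\|_\Gamma : \|B'\|_\Gamma$.
⊠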


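Theorem 6.62 Let $\mathfrak{S} = (S, A, R)$ and $\mathfrak{S}' = (S', A', R')$ be such that $\mathfrak{S}'$
is a completion of $\mathfrak{S}$. If the PTS $\lambda(\mathfrak{S}')$ is $\beta$-strongly normalising, then the
CD-PTS $\lambda^{CD}(\mathfrak{S})$ is $\beta\delta$-strongly normalising.


PROOF: Suppose that $\lambda(\mathfrak{S}')$ is $\beta$-strongly normalising, and suppose towards
a contradiction that $\lambda^{CD}(\mathfrak{S})$ is not $\beta\delta$-strongly normalising, i.e. there is an
infinite $\beta\delta$-reduction sequence $a_1 \to_{\beta\delta} a_2 \to_{\beta\delta} \ldots$, starting at $a \equiv a_1$ and
$\Gamma \vdash^{CD} a : A$.

Observe that the number of $\beta$-reductions in this sequence is infinite.
Otherwise there would be $n \in \mathbb{N}$ such that $\Gamma \vdash a_n \to_\delta a_{n+1} \to_\delta a_{n+2} \to_\delta \ldots$,
which contradicts the fact that $\delta$-reduction is strongly normalising
(Theorem 6.49).


We conclude that the reduction sequence is of the form

    $\Gamma \vdash a \twoheadrightarrow_\delta a_{n_1} \to_\beta a_{n_2} \twoheadrightarrow_\delta a_{n_3} \to_\beta a_{n_4} \twoheadrightarrow_\delta \cdots$


By Lemmas 6.57 and 6.58 there is an infinite $\beta$-reduction sequence starting
at $\|a\|_\Gamma$:

    $\|a\|_\Gamma \twoheadrightarrow_\beta \|a_{n_1}\|_\Gamma \twoheadrightarrow^{+}_\beta \|a_{n_2}\|_\Gamma \twoheadrightarrow_\beta \|a_{n_3}\|_\Gamma \twoheadrightarrow^{+}_\beta \|a_{n_4}\|_\Gamma \twoheadrightarrow_\beta \cdots$

and by Theorem 6.61, $\|\Gamma\| \vdash_{\mathfrak{S}'} \|a\|_\Gamma : \|A\|_\Gamma$, which contradicts the assumption
that $\lambda(\mathfrak{S}')$ is $\beta$-strongly normalising. ⊠

6d Restrictive use of parameters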


In the extension of PTSs to CD-PTSs presented in Sections 6a-6c, we did
not put any serious restrictions on the use of parameters:

1. If $\mathfrak{S} = (S, A, R)$ is a specification, then the introduction of a
parametric constant $c$ in $\lambda^{CD}(\mathfrak{S})$ only requires that its intended type $A$
is of type $s$, for some sort $s \in S$. Similarly, for the introduction of
a parametric definition we only require that its definiens $a$ is of a
certain type $A$. By correctness of types, either $A \equiv s$, or $A$ has type
$s$, for some $s \in S$;


2. Similarly, if $\Gamma \equiv \Gamma_1, c(\Delta)=a:A, \Gamma_2$, or $\Gamma \equiv \Gamma_1, c(\Delta):A, \Gamma_2$, the only
restrictions we put on $\Delta$ are that $\Delta$ must contain only variable
declarations, and that $\Gamma_1, \Delta$ must be legal. There are no additional
restrictions on the types $B_i$ of the declarations $x_i:B_i$ in $\Delta$.


Something similar is the case with $\Pi$-formation rules in a (parameter-free)
PTS in which there is no restriction on the use of $\Pi$-formation rules:
$(s_1, s_2, s_3) \in R$ for any $s_1, s_2, s_3 \in S$. In the specific situation that $S =
\{*, \Box\}$ and $A = \{*:\Box\}$, this would give the system $\lambda C$, which is on top of
the Barendregt Cube. The other systems of the Barendregt Cube cannot
be constructed if we do not put restrictions on the rules that are allowed.
It is the variation in the set of $\Pi$-formation rules that makes it possible to
distinguish the various type systems in the Cube (and the various logical
systems that are behind it, via the PAT-isomorphism).

In this section we study CD-PTSs in which we put restrictions on the
types of parametric constants and definitions, and their parameters. These
restrictions can be described in a set $P$ of parametric rules, just as the
restrictions on $\Pi$-formation rules are described in $R$. The effect of the rules
in $P$ is as follows.


• Assume we have a constant declaration $c(\Delta) : A$ that is part of a legal
context $\Gamma$. By Correctness of Contexts, $A$ has type $s$ for some $s \in S$.
Similarly, for each declaration $x_i:B_i$ in $\Delta$ there is a sort $s_i$ such that
$B_i$ has type $s_i$. The use of parameters is restricted by demanding
that $(s_i, s) \in P$ for $i = 1,\ldots,n$;


• In principle, the same holds for a definition declaration $c(\Delta)=a:A$.
However, there is a small difference on this point. It is not necessary
that $A$ has type $s$ for some sort $s \in S$: it can be the case that
$A \equiv s$ and that $s : s'$ does not hold for any $s' \in S$. This is a feature
that occurs in the DPTSs of Severi and Poll. To keep our system
compatible with the DPTSs, we want to maintain this feature.
To cover this case, we do not only introduce rules of the form $(s_i, s)$,
but also rules of the form $(s_i, \mathrm{TOP})$. If the use of parameters is
restricted by a set $P$, then either $(s_i, s) \in P$ for $i = 1,\ldots,n$, or $A$ is a
topsort, and $(s_i, \mathrm{TOP}) \in P$ for $i = 1,\ldots,n$.
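
The side condition imposed by $P$ amounts to a membership test per parameter. The sketch below assumes sorts are plain strings and that $P$ is a set of pairs; the function name `parameters_allowed`, the marker value `TOP`, and the sample sets are illustrative assumptions, not definitions from this chapter.

```python
TOP = "TOP"  # marker used when the type A is a topsort

def parameters_allowed(P, param_sorts, result_sort):
    """Check (s_i, s) in P for every parameter sort s_i.

    result_sort is the sort s of the type A, or TOP if A is a topsort.
    With no parameters the condition holds vacuously.
    """
    return all((s_i, result_sort) in P for s_i in param_sorts)

# Hypothetical rule set restricting parameters to the object level:
# parameter types and result types must both live in *.
object_level = {("*", "*")}
assert parameters_allowed(object_level, ["*", "*"], "*")
assert not parameters_allowed(object_level, ["[]"], "*")   # a type parameter
assert not parameters_allowed(object_level, ["*"], "[]")   # a type as result

# Allowing definitions whose type is a topsort needs rules of the form (s_i, TOP).
with_top = object_level | {("*", TOP)}
assert parameters_allowed(with_top, ["*"], TOP)
assert not parameters_allowed(object_level, ["*"], TOP)
```

As in the text, different parameters may rely on different elements of $P$; the check only requires that each pair is present.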


In the specific case of the Barendregt Cube, the combination of R and 
P leads to a refinement of the Cube, thus making it possible to classify 
more type systems within one and the same framework. 

The similarity of restricting the use of parameters by a set P with re- 
stricting the use of II-formation by a set R gives us a theoretical motivation 
for the work in this section. But there are also some practical motivations, 
as several type systems can be described using restriction of parameters. 


Example 6.63 Consider the Pascal function double that was presented 
in the Introduction to this Chapter. 


• Note that double only takes object variables as parameters. In
Pascal, it is not possible to have functions with type variables as
parameters;


• Moreover, double returns an object. It is not possible in Pascal to
construct functions that return a type as result.

So the use of parameters is restricted to the object level.


Other examples (ML, LF, AUTOMATH) are discussed in Section 6e. 


6d1 CD-PTSs with restricted parameters 


We now give a formal definition of pure type systems with restricted pa- 
rameters and restricted parametric definitions. 


Definition 6.64 (Parametric Specification) A parametric specification
is a quadruple (S, A, R, P) such that (S, A, R) is a specification (cf.
Definition A.17), and P ⊆ S × (S ∪ {TOP}). The parametric specification
is called singly sorted if the specification (S, A, R) is singly sorted.


The set P enables us to present a restricted version of the C-weakening
rule of Definition 6.20. We call this rule restricted C-weakening (Ĉ-weak):

\[
\frac{\Gamma \vdash^{\hat{C}} b : B \qquad \Gamma, \Delta_i \vdash^{\hat{C}} B_i : s_i \qquad \Gamma, \Delta \vdash^{\hat{C}} A : s}
     {\Gamma, c(\Delta) : A \vdash^{\hat{C}} b : B}
\quad (s_i, s) \in P
\]

The condition (s_i, s) ∈ P must hold for all i ∈ {1, ..., n}. However, it is
not necessary that all the s_i are equal: in one application of rule (Ĉ-weak)
it is possible to rely on more than one element of P.


Definition 6.65 The typing relation ⊢^Ĉ is the smallest relation on C_P ×
T_P × T_P closed under the rules in Definition A.20, (C-app) (see Definition
6.20), and (Ĉ-weak).


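Similarly, we present a restricted version of the D-weakening rule of
Definition 6.21. We call this rule restricted D-weakening (D̂-weak):

\[
\frac{\Gamma \vdash^{\hat{D}} b : B \qquad \Gamma, \Delta_i \vdash^{\hat{D}} B_i : s_i \qquad \Gamma, \Delta \vdash^{\hat{D}} a : A : s}
     {\Gamma, c(\Delta){=}a : A \vdash^{\hat{D}} b : B}
\quad (s_i, s) \in P
\]

Again, (s_i, s) ∈ P must hold for all i ∈ {1, ..., n}, and again it is not
necessary that all the s_i are equal.

For the case that A is a topsort, we present a special version of this
rule. By A : TOP we denote that A = s and that there is no s' ∈ S such
that (s : s') is in the set of axioms A.

\[
\frac{\Gamma \vdash^{\hat{D}} b : B \qquad \Gamma, \Delta_i \vdash^{\hat{D}} B_i : s_i \qquad \Gamma, \Delta \vdash^{\hat{D}} a : A : \mathrm{TOP}}
     {\Gamma, c(\Delta){=}a : A \vdash^{\hat{D}} b : B}
\quad (s_i, \mathrm{TOP}) \in P
\]

For all i ∈ {1, ..., n}, (s_i, TOP) ∈ P must hold, but the s_i may, again, be
different.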


Definition 6.66 The typing relation ⊢^D̂ is the smallest relation on C_P ×
T_P × T_P closed under the rules in Definition A.20, (D-app), both versions
of (D̂-weak), (D-form), (D-intro), and (D-conv) (see Definition 6.21).


Definition 6.67 The typing relation ⊢^{ĈD̂} is obtained from the relation
⊢^{CD} by replacing rule (C-weak) by the restricted rule (Ĉ-weak) and rule
(D-weak) by the restricted rules (D̂-weak).


Definition 6.68 (Pure Type Systems with Restricted Parameters
and Restricted Parametric Definitions) Let G be a parametric
specification. The pure type system with restricted parameters and restricted
parametric definitions (CD-PTS) and parametric specification G is denoted
λ^{ĈD̂}(G). The system consists of the set of terms T_P, the set of contexts
C_P, β-reduction, δ-reduction, and the typing relation ⊢^{ĈD̂}.

We do not extensively discuss the various meta-properties of CD-PTSs.
This is because a CD-PTS with parametric specification (S, A, R, P) is a
subsystem of the CD-PTS with specification (S, A, R). We only give a
stronger formulation of the extension of the Generation Lemma 6.51:


Lemma 6.69 (Generation Lemma, second extension)

If Γ ⊢^{ĈD̂} c(b_1, ..., b_n) : D then there exist s, Δ and A such that
Γ ⊢^{ĈD̂} D =_{βδ} A[x_i := b_i]_{i=1}^{n}, and Γ ⊢^{ĈD̂} b_i : B_i[x_j := b_j]_{j=1}^{i-1}.
Besides we have one of these three possibilities:


1. Either we have that Γ = ⟨Γ_1, c(Δ):A, Γ_2⟩ and Γ_1, Δ ⊢^{ĈD̂} A : s, and
for each i there is s_i with (s_i, s) ∈ P and Γ_1, Δ_i ⊢^{ĈD̂} B_i : s_i;


2. Or we have that Γ = ⟨Γ_1, c(Δ)=a:A, Γ_2⟩, Γ_1, Δ ⊢^{ĈD̂} a : A : s, and
for each i there is s_i with (s_i, s) ∈ P and Γ_1, Δ_i ⊢^{ĈD̂} B_i : s_i;


3. Or we have that Γ = ⟨Γ_1, c(Δ)=a:A, Γ_2⟩, Γ_1, Δ ⊢^{ĈD̂} a : A : TOP,
and for each i there is s_i with (s_i, TOP) ∈ P and Γ_1, Δ_i ⊢^{ĈD̂} B_i : s_i.


An important observation is the following one. 


Remark 6.70 Our systems with restricted parameters cover the PTSs
with Definitions (D-PTSs) that were introduced by Severi and Poll in [114].
Let G = (S, A, R) be a specification, and consider the parametric specification
G' = (S, A, R, ∅). The fact that the set of parametric rules is empty does
not exclude the existence of definitions: it is still possible to apply the rules
(D̂-weak) for n = 0. In that case, we obtain only definitions without
parameters, and the rules of the parametric system reduce precisely to the
rules of a D-PTS with specification G.

(Footnote to Remark 6.70: The parametric system with specification G' has
a C-weakening rule while the systems of Severi and Poll do not. But the
C-weakening rule can only be used for n = 0, and in that case C-weakening
can be imitated by the normal weakening rule of PTSs: a parametric constant
with zero parameters is in fact a parameter-free constant, and for such a
constant one can use a variable as well.)


For the comparison of CD-PTSs with other PTSs, we introduce some
terminology.

In the introduction to this Chapter, we argued that a parameter mechanism
can be seen as a system for abstraction and application that is weaker
than the λ-calculus mechanism. We will make this precise by proving
(in Theorem 6.79) that a D-PTS with specification (S, A, R) is as powerful
as any CD-PTS with parametric specification (S, A, R, P) for which
(s_1, s_2) ∈ P implies (s_1, s_2, s_2) ∈ R. We call such a CD-PTS parametrically
conservative:


Definition 6.71 Let G = (S, A, R, P) be a parametric specification. G
is parametrically conservative if for all s_1, s_2 ∈ S, (s_1, s_2) ∈ P implies
(s_1, s_2, s_2) ∈ R.


Each CD-PTS can be extended to a parametrically conservative one by 
taking its parametric closure: 


Definition 6.72 Let G = (S, A, R, P) be a parametric specification. We
define CL(G), the parametric closure of G, by (S, A, R', P), where R' =
R ∪ {(s_1, s_2, s_2) | (s_1, s_2) ∈ P}.


The Lemma below follows immediately from the definitions above. 


Lemma 6.73 Let G be a parametric specification.
1. CL(G) is parametrically conservative;
2. CL(CL(G)) = CL(G).
∎
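Definitions 6.71 and 6.72 and Lemma 6.73 can be tried out on small examples. The Python sketch below is an illustration only: specifications are modelled as tuples of sets of strings, the function names are not from the text, and pairs of the form (s, TOP) are skipped, on the assumption that the closure ranges over s_1, s_2 ∈ S as in Definition 6.71.

```python
def is_parametrically_conservative(S, R, P):
    """Definition 6.71: (s1, s2) in P with s1, s2 in S implies (s1, s2, s2) in R."""
    return all((s1, s2, s2) in R for (s1, s2) in P if s1 in S and s2 in S)


def parametric_closure(S, A, R, P):
    """Definition 6.72: add the rule (s1, s2, s2) to R for every (s1, s2) in P.

    Pairs (s, TOP) are ignored here (an assumption of this sketch).
    """
    extra = {(s1, s2, s2) for (s1, s2) in P if s1 in S and s2 in S}
    return (S, A, R | extra, P)


S = {"*", "□"}
A = {("*", "□")}
R = {("*", "*", "*")}
P = {("*", "*"), ("*", "□")}

closed = parametric_closure(S, A, R, P)
print(is_parametrically_conservative(S, R, P))          # (*, □, □) is missing from R
print(is_parametrically_conservative(S, closed[2], P))  # Lemma 6.73.1
print(parametric_closure(*closed) == closed)            # Lemma 6.73.2
```

In this example the closure adds exactly the Π-formation rule (*, □, □), so the set of rules R' of the closure is {(*, *, *), (*, □, □)}.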


6d2 Imitating parameters by λ-abstractions


Let G = (S, A, R, P) be a parametric specification. If G is parametrically
conservative, then each parametric rule (s_1, s_2) of G has an equivalent
Π-formation rule (s_1, s_2, s_2). In this section we show that this Π-formation
rule can indeed take over the role of the parametric rule (s_1, s_2). This
means that G has the same “power” (see Theorem 6.79) as (S, A, R, ∅).
With Remark 6.70 in mind, this even means that G has the same power as
the D-PTS with specification (S, A, R).

In order to compare G = (S, A, R, P) with G' = (S, A, R, ∅), we need
to remove the parameters from the syntax of λ^{ĈD̂}(G). This is easy:

• The parametric application in a term c(b_1, ..., b_n) is replaced by
function application c b_1 ⋯ b_n;


• A local parametric definition is translated into a parameter-free local
definition, and the parameters are replaced by λ-abstractions;


• A global parametric definition is translated into a parameter-free global
definition, and the parameters are replaced by λ-abstractions.


This leads to the following definitions: 


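Definition 6.74 We define the parameter-free translation {t} of a term
t ∈ T_P as follows:

\[
\begin{array}{lcl}
\{x\} & = & x; \\
\{s\} & = & s; \\
\{c(b_1, \ldots, b_n)\} & = & c\,\{b_1\} \cdots \{b_n\}; \\
\{ab\} & = & \{a\}\{b\}; \\
\{\lambda x{:}A.b\} & = & \lambda x{:}\{A\}.\{b\}; \\
\{\Pi x{:}A.B\} & = & \Pi x{:}\{A\}.\{B\}; \\
\{c(\Delta){=}a{:}A \ \mathrm{IN}\ b\} & = & c(){=}\{\lambda\Delta.a\} : \{\Pi\Delta.A\} \ \mathrm{IN}\ \{b\}.
\end{array}
\]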
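The clauses of this translation can be sketched as a recursive function over a small term representation. The Python code below is only an illustration: the class names (`Var`, `Sort`, `Const`, `App`, `Lam`, `Pi`, `LocalDef`) and the function name `translate` are assumptions of this sketch, and a constant without arguments, `Const("c")`, plays the role of the parameter-free constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Sort:
    name: str

@dataclass(frozen=True)
class Const:          # c(b_1, ..., b_n); Const("c") is the parameter-free constant
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: str
    typ: object
    body: object

@dataclass(frozen=True)
class Pi:
    var: str
    typ: object
    body: object

@dataclass(frozen=True)
class LocalDef:       # c(x_1:B_1, ..., x_n:B_n) = a : A IN b
    name: str
    params: tuple     # tuple of (variable name, type) pairs
    body: object
    typ: object
    scope: object


def translate(t):
    """Parameter-free translation {t}, following the clauses of Definition 6.74."""
    if isinstance(t, (Var, Sort)):
        return t
    if isinstance(t, Const):
        result = Const(t.name)
        for b in t.args:                  # parametric application becomes application
            result = App(result, translate(b))
        return result
    if isinstance(t, App):
        return App(translate(t.fun), translate(t.arg))
    if isinstance(t, Lam):
        return Lam(t.var, translate(t.typ), translate(t.body))
    if isinstance(t, Pi):
        return Pi(t.var, translate(t.typ), translate(t.body))
    if isinstance(t, LocalDef):
        body, typ = translate(t.body), translate(t.typ)
        for x, B in reversed(t.params):   # parameters become λ- and Π-abstractions
            body = Lam(x, translate(B), body)
            typ = Pi(x, translate(B), typ)
        return LocalDef(t.name, (), body, typ, translate(t.scope))
    raise TypeError(f"not a term: {t!r}")


print(translate(Const("c", (Var("x"), Var("y")))))
```

Because the classes are frozen dataclasses, translated terms can be compared structurally with `==`, which is convenient for checking small cases by hand.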


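Definition 6.75 We extend the definition of {_} to contexts:

\[
\begin{array}{lcl}
\{\langle\rangle\} & = & \langle\rangle; \\
\{\Gamma, x{:}A\} & = & \{\Gamma\}, x{:}\{A\}; \\
\{\Gamma, c(\Delta){:}A\} & = & \{\Gamma\}, c() : \{\Pi\Delta.A\}; \\
\{\Gamma, c(\Delta){=}a{:}A\} & = & \{\Gamma\}, c(){=}\{\lambda\Delta.a\} : \{\Pi\Delta.A\}.
\end{array}
\]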


To demonstrate the behaviour of {_} under βδ-reduction, we need a
lemma that shows how substitution interacts with {_}. The
proof is straightforward, using induction on the structure of a.


Lemma 6.76 For a, b ∈ T_P: {a[x:=b]} = {a}[x:={b}]. ∎


The mapping {_} maintains β-reduction. A δ-reduction is translated
into a δ-reduction followed by zero or more β-reductions. These
β-reductions take over the n substitutions that are needed in a δ-reduction

\[
c(b_1, \ldots, b_n) \to_\delta a[x_i := b_i]_{i=1}^{n}.
\]
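To make the counting concrete, consider the case n = 2 in a context that contains the definition c(x_1:B_1, x_2:B_2)=a:A. The following worked instance is only an illustration; it assumes that bound variables are chosen such that no capture occurs and that x_2 does not occur free in b_1, so that the two substitutions may be carried out one after the other:

\[
\begin{array}{rcl}
c(b_1, b_2) & \to_\delta & a[x_1 := b_1][x_2 := b_2] \\[1ex]
c\,\{b_1\}\,\{b_2\} & \to_\delta & (\lambda x_1{:}\{B_1\}.\lambda x_2{:}\{B_2\}.\{a\})\,\{b_1\}\,\{b_2\} \\
 & \to_\beta & (\lambda x_2{:}\{B_2\}[x_1 := \{b_1\}].\{a\}[x_1 := \{b_1\}])\,\{b_2\} \\
 & \to_\beta & \{a\}[x_1 := \{b_1\}][x_2 := \{b_2\}]
\end{array}
\]

One δ-step on the parametric side thus corresponds to one δ-step followed by two β-steps on the parameter-free side, one β-step for each parameter.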
Lemma 6.77
1. If a →_β a' then {a} →_β^+ {a'};
2. If Γ ⊢ a →_δ a' then there is a'' such that {Γ} ⊢ {a} →_δ^+ a'' ↠_β {a'};
3. If Γ ⊢ a ↠_{βδ} a' then {Γ} ⊢ {a} ↠_{βδ} {a'}.


PROOF: (1) follows easily by induction on the structure of a, and Lemma
6.76. (3) follows from (1) and (2). We only show (2), using induction on
the definition of Γ ⊢ a →_δ a', and treating only the most important case.
Assume Γ = Γ_1, c(Δ)=a:A, Γ_2 and Γ ⊢ c(b_1, ..., b_n) →_δ a[x_i:=b_i]_{i=1}^{n}.
Observe that {Γ} = {Γ_1}, c()={λΔ.a}:{ΠΔ.A}, {Γ_2}, so

\[
\begin{array}{rcl}
\{\Gamma\} \vdash c()\,\{b_1\} \cdots \{b_n\} & \to_\delta & \{\lambda\Delta.a\}\,\{b_1\} \cdots \{b_n\} \\
 & \twoheadrightarrow_\beta & \{a\}[x_i := \{b_i\}]_{i=1}^{n} \\
 & = & \{a[x_i := b_i]_{i=1}^{n}\}.
\end{array}
\]
∎


Remark 6.78 In 6.77.1, we cannot replace →_β^+ by →_β. This has to do
with the definition of {c(Δ)=a:A IN b}. One β-reduction in Δ gives rise to
(at least) two β-reductions in c()={λΔ.a}:{ΠΔ.A} IN {b}.

Similarly, we cannot replace the →_δ^+ in 6.77.2 by →_δ.


Now we show that {_} embeds the CD-PTS with parametric
specification G = (S, A, R, P) in the CD-PTS with parametric specification
G' = (S, A, R, ∅), provided that G is parametrically conservative.


Theorem 6.79 Let G = (S, A, R, P) be a parametric specification.
Assume G is parametrically conservative. Let G' = (S, A, R, ∅). Then

\[
\Gamma \vdash^{\hat{C}\hat{D}}_{G} a : A \ \Longrightarrow\ \{\Gamma\} \vdash^{\hat{C}\hat{D}}_{G'} \{a\} : \{A\}.
\]


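PROOF: Induction on the derivation of Γ ⊢^{ĈD̂}_G a : A. With the help of
Lemma 6.76 and Lemma 6.77.3, all cases are straightforward except for the
(Ĉ-weak) and (D̂-weak) rules. We only treat the (D̂-weak) rule; the proof
for (Ĉ-weak) is similar. So: assume the last step of the derivation was

\[
\frac{\Gamma \vdash^{\hat{C}\hat{D}}_{G} b : B \qquad \Gamma, \Delta_i \vdash^{\hat{C}\hat{D}}_{G} B_i : s_i \qquad \Gamma, \Delta \vdash^{\hat{C}\hat{D}}_{G} a : A : s}
     {\Gamma, c(\Delta){=}a : A \vdash^{\hat{C}\hat{D}}_{G} b : B}
\]

By the induction hypothesis, we have:

\[
\begin{array}{ll}
\{\Gamma\} \vdash^{\hat{C}\hat{D}}_{G'} \{b\} : \{B\}; & (15) \\
\{\Gamma, \Delta_i\} \vdash^{\hat{C}\hat{D}}_{G'} \{B_i\} : s_i; & (16) \\
\{\Gamma, \Delta\} \vdash^{\hat{C}\hat{D}}_{G'} \{a\} : \{A\}; & (17) \\
\{\Gamma, \Delta\} \vdash^{\hat{C}\hat{D}}_{G'} \{A\} : s. & (18)
\end{array}
\]

G is parametrically conservative, so (s_i, s, s) ∈ R for i = 1, ..., n.
Therefore, we can repeatedly use the Π-formation rule, starting with (18) and
(16), obtaining

\[
\{\Gamma\} \vdash^{\hat{C}\hat{D}}_{G'} \textstyle\prod_{i=1}^{n} x_i{:}\{B_i\}.\{A\} : s. \qquad (19)
\]

Notice: ∏_{i=1}^{n} x_i:{B_i}.{A} = {ΠΔ.A}. Repeatedly using the rule for
λ-abstraction, using (17) and (19), results in

\[
\{\Gamma\} \vdash^{\hat{C}\hat{D}}_{G'} \lambda_{i=1}^{n} x_i{:}\{B_i\}.\{a\} : \{\Pi\Delta.A\}. \qquad (20)
\]

Similarly, λ_{i=1}^{n} x_i:{B_i}.{a} = {λΔ.a}. Using (D̂-weak) (for the
specification G') on (15), (16), (19) and (20) results in

\[
\{\Gamma\}, c(){=}\{\lambda\Delta.a\} : \{\Pi\Delta.A\} \vdash^{\hat{C}\hat{D}}_{G'} \{b\} : \{B\}.
\]
∎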


Remark 6.80 The results in Section 6d2 were presented for CD-PTSs.
The same result, however, can be obtained for Ĉ-PTSs, that is: for PTSs
with restricted parameters, but without definitions. We can also give an
alternative formulation of Remark 6.70, stating that a Ĉ-PTS with
specification (S, A, R, ∅) is in fact nothing more than a C-PTS with
specification (S, A, R).
6d3 Refined Barendregt Cubes


Theorem 6.79 has important consequences. The mapping {_} is fairly
simple. It only translates some parametric abstractions and applications into
λ-calculus style abstractions and applications. Hence a CD-PTS with
parametric specification G = (S, A, R, ∅) can be extended with any set of
parametric rules without extending its logical power, as long as the parametric
specification obtained remains parametrically conservative.

In this section, we will apply the insight obtained in Section 6d2 to a
concrete situation: the Barendregt Cube. The Barendregt Cube (Figure 13
on page 300) is a three-dimensional presentation of eight well-known PTSs.
All systems have sorts S = {*, □}, and axioms A = {(*, □)}. Moreover,
all the systems have rule (*, *, *). System λ→ has no extra rules, but the
other seven systems all have one or more of the rules (*, □, □), (□, *, *) and
(□, □, □):


• Going to the right in the cube means adding rule (*, □, □);
• Going upwards in the cube means adding rule (□, *, *);
• Going backward in the cube means adding rule (□, □, □).


Thus, going to the right, going upwards and going backward means going
to a stronger type system.
The systems depicted in Figure 13 have the following Π-formation rules:


λ→     (*, *, *)
λ2     (*, *, *)   (□, *, *)
λP     (*, *, *)   (*, □, □)
λω̲     (*, *, *)   (□, □, □)
λP2    (*, *, *)   (□, *, *)   (*, □, □)
λω     (*, *, *)   (□, *, *)   (□, □, □)
λPω̲    (*, *, *)   (*, □, □)   (□, □, □)
λC     (*, *, *)   (□, *, *)   (*, □, □)   (□, □, □)


This cube can be constructed not only for PTSs, but also for C-PTSs,
D-PTSs, Ĉ-PTSs, D̂-PTSs, and their combinations (see Figure 10 on page
235).

With Theorem 6.79, we can place certain CD-PTSs in the cube of
D-PTSs (and, with Remark 6.80 in mind, certain Ĉ-PTSs can be placed in
the cube of C-PTSs). Let us, for example, have a look at the following
parametric specifications:


(S, A, {(*, *, *), (*, □, □)}, ∅);

(S, A, {(*, *, *), (*, □, □)}, {(*, *)});

(S, A, {(*, *, *), (*, □, □)}, {(*, □)});

(S, A, {(*, *, *), (*, □, □)}, {(*, *), (*, □)}).


where S = {*, □} and A = {(*, □)}. According to Theorem 6.79, the
CD-PTSs with the above specifications are all equal in power, and according to
Remark 6.70, they are all equal in power to the D-PTS with the specification
of λP.

Now look at the parametric specification


G = (S, A, {(*, *, *)}, {(*, *), (*, □)}).


The Ĉ-PTS λ^Ĉ(G) is clearly stronger than the PTS λ→, as in λ^Ĉ(G) it
is possible (in a restricted way) to talk about predicates. For instance, we
can have the following context:


    α : *,
    eq(x:α, y:α) : *,
    refl(x:α) : eq(x,x),
    symm(x:α, y:α, p:eq(x,y)) : eq(y,x),
    trans(x:α, y:α, z:α, p:eq(x,y), q:eq(y,z)) : eq(x,z)


This context introduces an equality predicate eq on objects of type α, and
axioms refl, symm, trans for the reflexivity, symmetry and transitivity of
eq. It is not possible to introduce such a predicate eq in the PTS λ→
without any parameter mechanism. On the other hand, λ^Ĉ(G) is weaker
than the PTS λP: in λP we can construct the type Πx:α.Πy:α.*, which
allows us to introduce variables eq of type Πx:α.Πy:α.*. This makes it
possible to speak about any binary predicate, instead of one fixed predicate
eq. It also gives us the possibility to speak about the term eq without the
need to apply two terms of type α to it (cf. the “philosophical argument”
in the introduction to this Chapter).

Altogether, this puts the Ĉ-PTS λ^Ĉ(G) clearly in between the PTSs
λ→ and λP. Similarly, the CD-PTS λ^{ĈD̂}(G) is in between the D-PTSs
λ→ and λP. We can illustrate this in the Barendregt Cube by putting the
specification G in the middle of the edge that connects the systems λ→
and λP.

This idea can be generalised to obtain a refinement of the Barendregt
Cube. We start with the system λ→. Adding an extra Π-formation rule
(s_1, s_2, s_2) to λ→ corresponds to moving in one dimension (to the right,
upward, or backward) in the Cube. We add the possibility of moving in
one dimension in the Cube, but stopping half-way, and we let
this movement correspond to extending the system with the parameter
rule (s_1, s_2). This “going only half-way” is in line with Theorem 6.79,
which says that Π-formation rule (s_1, s_2, s_2) can mimic the parameter rule
(s_1, s_2). In other words, the system obtained by “going all the way” is at
least as strong as the system obtained by “going only half-way”.
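The idea of “going half-way” can be made tangible by assigning a coordinate to each direction of the Cube. The Python sketch below is an illustration only: the encoding of positions as the numbers 0, 0.5 and 1 and the names `AXES` and `cube_position` are assumptions of this sketch, and every system is assumed to contain the rule (*, *, *).

```python
STAR, BOX = "*", "□"

# The three directions of the Cube, each identified by the sorts (s1, s2)
# of its Π-formation rule (s1, s2, s2) and its parameter rule (s1, s2).
AXES = {
    "right": (STAR, BOX),
    "up": (BOX, STAR),
    "backward": (BOX, BOX),
}


def cube_position(R, P):
    """Return a coordinate per direction: 1 for a Π-rule, 0.5 for a parameter rule only."""
    position = {}
    for axis, (s1, s2) in AXES.items():
        if (s1, s2, s2) in R:
            position[axis] = 1.0      # going all the way
        elif (s1, s2) in P:
            position[axis] = 0.5      # going only half-way
        else:
            position[axis] = 0.0
    return position


# The specification G discussed above: half-way along the edge from λ→ to λP.
print(cube_position({(STAR, STAR, STAR)}, {(STAR, STAR), (STAR, BOX)}))
# The system λP itself: all the way to the right.
print(cube_position({(STAR, STAR, STAR), (STAR, BOX, BOX)}, set()))
```

Note that the parameter rule (*, *) does not change the position, in line with Theorem 6.79: the rule (*, *, *) that is present in every system of the Cube can already mimic it.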

The refinement of the Barendregt Cube is depicted in Figure 11. 


6e Systems in the refined Barendregt Cube 


In this section, we show that the Refined Barendregt Cube enables us to 
compare some well-known type systems with systems from the Barendregt 
Cube. In particular, we show that ML, LF, λ68, and λQE can be seen as
systems in the Refined Barendregt Cube. This is depicted in Figure 12 on 
page 283, and motivated in the four subsections below. 


6e1 ML


In ML (see for instance [90]) one can define the polymorphic identity as
follows (we use the notation of this Chapter; in ML, the types and the
parameters are left implicit):


    Id(α:*) = (λx:α.x) : (α → α).


But it is not possible to make an explicit λ-abstraction over α:*: the
expression

    Id = (λα:*.λx:α.x) : (Πα:*.α → α)


cannot be constructed in ML, as the type Πα:*.α → α does not belong
to the language of ML. Therefore, we can state that ML does not have a
Π-formation rule (□, *, *), but that it does have the parametric rule (□, *).

[Figure 11: The refined Barendregt Cube]

Similarly, one can introduce the type of lists together with some
elementary operations in ML as follows:


    List(α:*) : *;
    nil(α:*) : List(α);
    cons(α:*) : α → List(α) → List(α),

but the expression Πα:*.* does not belong to ML, so introducing List by

    List : Πα:*.*


is not possible in ML. We conclude that ML does not have a Π-formation
rule (□, □, □), but only the parametric rule (□, □). Together with the fact
that ML has a Π-formation rule (*, *, *), this places ML in the middle of
the left side of the refined Barendregt Cube, exactly in between λ→ and
λω.


6e2 LF 


Geuvers [54] initially describes the system LF (see [59]) as the PTS λP.
However, the use of the Π-formation rule (*, □, □) is quite restrictive in
most applications of LF. Geuvers splits the λ-formation rule in two rules:

\[
(\lambda_0)\quad \frac{\Gamma, x{:}A \vdash M : B \qquad \Gamma \vdash \Pi x{:}A.B : *}{\Gamma \vdash \lambda_0 x{:}A.M : \Pi x{:}A.B}
\]

\[
(\lambda_P)\quad \frac{\Gamma, x{:}A \vdash M : B \qquad \Gamma \vdash \Pi x{:}A.B : \Box}{\Gamma \vdash \lambda_P x{:}A.M : \Pi x{:}A.B}
\]

System LF without rule (λ_P) is called LF⁻. β-reduction is split into
β_0-reduction and β_P-reduction:

\[
\begin{array}{rcl}
(\lambda_0 x{:}A.M)N & \to_{\beta_0} & M[x := N]; \\
(\lambda_P x{:}A.M)N & \to_{\beta_P} & M[x := N].
\end{array}
\]

Geuvers then shows that


• If M : * or M : A : * in LF, then the β_P-normal form of M contains
no λ_P;


• If Γ ⊢_{LF} M : A, and Γ, M, A do not contain a λ_P, then Γ ⊢_{LF⁻} M : A;


• If Γ ⊢_{LF} M : A (: *), all in β_P-normal form, then Γ ⊢_{LF⁻} M : A (: *).


This means that the only real need for a type Πx:A.B : □ is to be able
to declare a variable in it. The only point at which this is really done is
where the bool-style implementation of PAT is made (see Section 4a4): the
construction of the type of the operator Prf (in an unparameterised form)
has to be made as follows:

\[
\frac{\mathit{prop} : * \vdash \mathit{prop} : * \qquad \mathit{prop} : *, \alpha : \mathit{prop} \vdash * : \Box}
     {\mathit{prop} : * \vdash (\Pi\alpha{:}\mathit{prop}.*) : \Box}
\]

In the practical use of LF, this is the only point where the Π-formation rule
(*, □, □) is used. No λ_P-abstractions are used, either, and the term Prf is
only used when it is applied to a term p:prop. This means that the practical
use of LF would not be restricted if we introduced Prf in a parametric form,
and replaced the Π-formation rule (*, □, □) by a parameter rule (*, □). This
puts (the practical applications of) LF in between the systems λ→ and λP
in the Refined Barendregt Cube.


6e3 λ68 and AUT-68


Looking back at the system AUT-68 of Section 5a and its λ-calculus variant
λ68 that was constructed and discussed in Sections 5b-5c, we remark that
AUT-68 has a parameter mechanism and a mechanism for global parametric
definitions:


• A line (Γ; k; PN; type) in a book is nothing more than the declaration
of a parametric constant k(Γ):*, and a line (Γ; k; Σ_1; type) is the
declaration of a global parametric definition k(Γ)=Σ_1:*. There are
no demands on the context Γ, and this means that for a declaration
x:A ∈ Γ we can have either A = type (in PTS-terminology: A = *,
so A : □) or A : type (in PTS-terminology: A : *). We conclude that
AUT-68 has the parameter rules (*, □) and (□, □);


• Similarly, lines of the form (Γ; k; PN; Σ_2) and (Γ; k; Σ_1; Σ_2), where
Σ_2 : type, represent parametric constants and global parametric
definitions that are constructed using the parameter rules (*, *) and (□, *).


Moreover, AUT-68 has a λ-calculus mechanism with (*, *, *) as its only
Π-formation rule.
This suggests that AUT-68 can be represented by a CD-PTS with
specification


G_68 = (S, A, {(*, *, *)}, S × S)


where S = {*, □} and A = {(*, □)}. This system can be found in the exact
middle of the refined Barendregt Cube.

As for the structure of abstraction and application, this gives a good
description of AUT-68. The position of AUT-68 in the Refined Barendregt
Cube gives a far better idea of the force of AUT-68 than, for instance, the
description of AUT-68 in [5], where it cannot be clearly positioned in the
Barendregt Cube. Another advantage is that λ^{ĈD̂}(G_68) has parameters.
Thus, it is closer to the original system AUT-68 than the system λ68 that
was described in Chapter 5, and in [5] (though in Theorem 5.62 and Remark
5.63, we minutely described the way in which the parameter mechanism
appears in λ68).

On the other hand, we should not say that AUT-68 is exactly the system
λ^{ĈD̂}(G_68). There are several differences:


• DPTSs have global and local definitions. AUTOMATH has only global
definitions;


• In DPTSs, the type B of a definition x=T:B does not have to be
typable itself. In AUTOMATH, B has to be typable;


• The δ-reduction of DPTSs is not substitutive; δ-reduction of
AUTOMATH is substitutive.


These differences can also be found between AUT-68 and the DPTSs of 
Severi and Poll (see Section 5d2). 


6e4 λQE and AUT-QE


In λQE we have a Π-formation rule (*, □, □) in addition to the rules of
λ68. This means that the applicational and abstractional behaviour can
be described by the CD-PTS with Π-formation rules (*, *, *) and (*, □, □),
and parametric rules (s_1, s_2) for s_1, s_2 ∈ S. This system is located in the
middle of the right side of the Refined Barendregt Cube, exactly in between
λC and λP. Again, this is not the exact representation of AUT-QE; there are
differences that are similar to those described in Section 5d2. Moreover,
AUT-QE has a rule of type inclusion (see the Conclusion of Chapter 5),
which is not taken into account in CD-PTSs.


6e5 PAL 


The AUTOMATH languages are all based on two concepts: typed λ-calculus
and a parameter/definition mechanism. Both concepts can be isolated: it
is possible to study λ-calculus without a parameter/definition mechanism
(for instance via the format of Pure Type Systems), but one can also isolate
the parameter/definition mechanism from AUTOMATH. One then obtains
a language that is called PAL, the “Primitive AUTOMATH Language”. It
cannot be described within the Refined Barendregt Cube (as all the
systems in that cube have at least some basic λ-calculus in them), but it can be
described as a CD-PTS with the following parametric specification:


    S = {*, □}
    A = {(*, □)}
    R = ∅
    P = {(*, *), (*, □), (□, *), (□, □)}


This parametric specification corresponds to the parametric specifications
that were given for the AUTOMATH systems above, from which the
Π-formation rules are removed.


6f First-order predicate logic 


A standard way to code first-order predicate logic in PAT-style
(Curry-Howard variant) uses a type system that resembles λP. It is due to
Berardi, and presented in Definition 5.4.5 of [5].

To keep objects and object types separated from proofs and
propositions, the sorts * and □ of λP are replaced by *_s, *_p, *_f, □_s and □_p. Here,
*_s and □_s handle the objects and object types, whilst *_p, □_p are used for
propositions and their proofs. The sort *_f is used to store the types of the
function symbols of the first-order language. For the construction of logical
implication and universal quantification, the Π-formation rules (*_p, *_p, *_p)
and (*_s, *_p, *_p) are used. The Π-formation rule (*_s, *_s, *_f) allows the
formation of a function space between object types, and the Π-formation rule
(*_s, *_f, *_f) makes it possible to form functions of several arguments between
object types. There is no sort □_f, as free variables for function spaces are
not allowed. The construction of relation symbols requires Π-formation
rule (*_s, □_p, □_p).

[Figure 12: LF, ML, λ68, and λQE in the refined Barendregt Cube]

Thus, we find a PTS (or a D-PTS) with the following specification:


    S = {*_s, *_p, *_f, □_s, □_p};
    A = {(*_s, □_s), (*_p, □_p)};
    R = {(*_s, *_s, *_f), (*_s, *_f, *_f), (*_s, *_p, *_p), (*_p, *_p, *_p), (*_s, □_p, □_p)}.


Due to the Π-formation rule (*_s, □_p, □_p) in the PTS-representation of
first-order logic, there are types that are not in β-normal form:


Example 6.81 For a term A : *_s, we can form the type Πx:A.*_p. If b is
a term of type *_p in which a variable x:A may occur free, we can form
λx:A.b of type Πx:A.*_p. Applying this term to a term a of type A results
in (λx:A.b)a of type *_p. This term is a type (because it has type *_p) and is
not in β-normal form.
If a PTS has types that are not in β-normal form, it is possible that
there are applications of the conversion rule

\[
\frac{\Gamma \vdash A : B \qquad \Gamma \vdash B' : s \qquad B =_\beta B'}{\Gamma \vdash A : B'}
\]

in a deduction in such a PTS. The conversion rule has as a disadvantage
that its implementation in computer systems makes the system slow. This is
because it may be very time-consuming (or memory-consuming) to establish
whether two λ-terms are β-equal or not. Hence, it would be useful to have
a type system in which all types are in β-normal form.

In the formulation of first-order predicate logic above, it is only the rule
(*_s, □_p, □_p) which allows us to form types that are not in β-normal form. We
show this as follows. Assume Γ ⊢ P : s, P is not in β-normal form, and
all the subterms P' of P that are a type are in β-normal form. Then P
cannot be a sort or a variable. As P has type s, P cannot be of the form
λx:P_1.P_2, either. If P = Πx:P_1.P_2 then either P_1 or P_2 is not in β-normal
form. As P_1 and P_2 are both types, this does not occur. So P must be
an application term P_1P_2. By the Generation Lemma for PTSs, there is
a type A such that Γ ⊢ P_1 : (Πx:A.s). By Correctness of
Types, there is a sort s' such that Γ ⊢ (Πx:A.s) : s'. By the Generation
Lemma, there is (s_1, s_2, s') ∈ R such that Γ ⊢ A : s_1 and Γ, x:A ⊢ s : s_2.
This means that (s, s_2) is an axiom, and therefore s_2 ∈ {□_s, □_p}. Hence,
(s_1, s_2, s') = (*_s, □_p, □_p).

We conclude that implementations of first-order predicate logic in type
theory would be more efficient if it were possible to avoid rule (*_s, □_p, □_p).
With the use of parameters, it is easy to avoid that rule. This is because
rule (*_s, □_p, □_p) is only necessary to type the relation symbols of the
first-order language. And as relation symbols in a first-order language are always
introduced with parameters, it is no restriction to introduce them in the
type system in a parametrised way. This can be done with parameter-rule
(*_s, □_p): if we want to introduce an n-ary relation symbol R with arguments
of type U_1, ..., U_n (where the U_i are of type *_s), we apply restricted
C-weakening (let Δ = x_1:U_1, ..., x_n:U_n and Δ_i = x_1:U_1, ..., x_{i-1}:U_{i-1}):

\[
\frac{\Gamma \vdash b : B \qquad \Gamma, \Delta_i \vdash U_i : *_s \qquad \Gamma, \Delta \vdash *_p : \Box_p}
     {\Gamma, R(\Delta) : *_p \vdash b : B}
\]

This involves the use of the parameter-rule (*_s, □_p).

Hence, replacing rule (*_s, □_p, □_p) by parameter-rule (*_s, □_p) enables one
to remove the conversion rule in the type-theoretic representation of
first-order predicate logic, making it more efficient (see the forthcoming Theorem
6.84). It is reasonable to replace more rules by parameter-rules in the case
of first-order predicate logic, as we now explain.

Function symbols in a first-order language are also of a parametric
nature. The sort *_f and the Π-formation rules (*_s, *_s, *_f) and (*_s, *_f, *_f) are only
used to construct the types of these function symbols. We can introduce
these function symbols in a more realistic way by using parametric rule
(*_s, *_s) instead of the Π-formation rules (*_s, *_s, *_f) and (*_s, *_f, *_f):

\[
\frac{\Gamma \vdash b : B \qquad \Gamma, \Delta_i \vdash U_i : *_s \qquad \Gamma, \Delta \vdash U : *_s}
     {\Gamma, f(\Delta) : U \vdash b : B}
\]

We have now obtained a Ĉ-PTS with parametric specification G' =
(S', A', R', P'), where:


    S' = {*_s, *_p, □_s, □_p};
    A' = {(*_s, □_s), (*_p, □_p)};
    R' = {(*_s, *_p, *_p), (*_p, *_p, *_p)};
    P' = {(*_s, *_s), (*_s, □_p)}.


We now prove that types in this Ĉ-PTS are always in β-normal form.
For the proof we need as a lemma that any object term (that is: a term P
such that there is Q with P : Q : *_s) is in β-normal form.
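A property of P' that is used repeatedly in the proofs below is that every parameter has a type of sort *_s. The small Python sketch below merely checks this on the set P'; the string encoding of the sorts and the name `parameter_sorts` are assumptions of this sketch.

```python
# The parametric rules of G', with sorts encoded as strings.
P_prime = {("*s", "*s"), ("*s", "□p")}


def parameter_sorts(P):
    """Collect the first components s_i of all parametric rules (s_i, s) in P."""
    return {s_i for (s_i, _) in P}


print(parameter_sorts(P_prime))  # the only sort that occurs is *s
```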


Lemma 6.82 If Γ ⊢^Ĉ_{G'} P : Q : *_s then P is in β-normal form.


PROOF: Induction on the structure of P.


• The cases P ∈ V and P ∈ S' are trivial;

• If P = c(b_1, ..., b_n) then we use the second extension of the
Generation Lemma, 6.69, and determine B_1, ..., B_n, B and s_1, ..., s_n, s
such that Γ ⊢^Ĉ_{G'} b_i : B_i[x_j:=b_j]_{j=1}^{i-1} and
Γ, x_1:B_1, ..., x_{i-1}:B_{i-1} ⊢^Ĉ_{G'} B_i : s_i,
and (s_i, s) ∈ P'. Due to the definition of P', s_i = *_s for all i.
By the Substitution Lemma, Γ ⊢^Ĉ_{G'} B_i[x_j:=b_j]_{j=1}^{i-1} : *_s, and therefore
Γ ⊢^Ĉ_{G'} b_i : B_i[x_j:=b_j]_{j=1}^{i-1} : *_s. By the induction hypothesis, the b_i are in
β-normal form. Therefore, c(b_1, ..., b_n) is in β-normal form;

• If P = P_1P_2 then there are (Generation Lemma) R_1, R_2 such that
Γ ⊢^Ĉ_{G'} P_1 : Πx:R_1.R_2, and Q =_β R_2[x:=P_2]. By Correctness of Types
there is s ∈ S' such that Γ ⊢^Ĉ_{G'} (Πx:R_1.R_2) : s. By the
Generation Lemma and the definition of R', Γ, x:R_1 ⊢^Ĉ_{G'} R_2 : *_p. By the
Substitution Lemma, Γ ⊢^Ĉ_{G'} R_2[x:=P_2] : *_p. Let Q' be a common
β-reduct of Q and R_2[x:=P_2]. By Subject Reduction, Γ ⊢^Ĉ_{G'} Q' : *_s and
Γ ⊢^Ĉ_{G'} Q' : *_p, which contradicts Unicity of Types. We conclude that
the case P = P_1P_2 does not occur;

• If P = λx:P_1.P_2 then there are R_1, R_2 such that Q =_β Πx:R_1.R_2. Let
Q' be a common β-reduct of Q and Πx:R_1.R_2. There are R_1', R_2' such
that Q' = Πx:R_1'.R_2'. By Subject Reduction, Γ ⊢^Ĉ_{G'} Πx:R_1'.R_2' : *_s.
By the Generation Lemma, there are s_1, s_2 such that (s_1, s_2, *_s) ∈ R'.
This is not the case. So the case P = λx:P_1.P_2 does not occur;

• If P = Πx:P_1.P_2 then there is s such that Q = s. By the Generation
Lemma, this would mean that s : *_s is an axiom, which is not the
case. So the case P = Πx:P_1.P_2 does not occur.
∎


Remark 6.83 The proof of this lemma not only shows that a P for which
P : Q : *_s is always in normal form. It also shows that P can only be
a variable or an expression of the form c(b_1, ..., b_n) such that there are
B_1, ..., B_n with b_i : B_i : *_s. This corresponds exactly to the definition of
terms in first-order logic. We conclude that our specification G' results in
an exact description of the terms of first-order logic.


Theorem 6.84 Assume Γ ⊢^Ĉ_{G'} P : s. Then P is in β-normal form.


PROOF: Induction on the structure of P.


• The cases P ∈ V and P ∈ S' are trivial;

• P = c(b_1, ..., b_n). By the second extension of the Generation Lemma
6.69, there are sorts s_1, ..., s_n and terms B_1, ..., B_n such that (s_i, s) ∈
P', Γ ⊢^Ĉ_{G'} b_i : B_i[x_j:=b_j]_{j=1}^{i-1} and
Γ, x_1:B_1, ..., x_{i-1}:B_{i-1} ⊢^Ĉ_{G'} B_i : s_i. By
the definition of P', s_i = *_s for all i. By the Substitution Lemma,
Γ ⊢^Ĉ_{G'} b_i : B_i[x_j:=b_j]_{j=1}^{i-1} : *_s. By Lemma 6.82, the b_i are in β-normal
form. Therefore c(b_1, ..., b_n) is in β-normal form;

• P = P_1P_2. By the Generation Lemma, there are R_1, R_2 such that
Γ ⊢^Ĉ_{G'} P_1 : Πx:R_1.R_2 and s =_β R_2[x:=P_2]. By Correctness of Types,
the Generation Lemma and the definition of R', Γ, x:R_1 ⊢^Ĉ_{G'} R_2 : *_p.
By the Substitution Lemma, Γ ⊢^Ĉ_{G'} R_2[x:=P_2] : *_p. By Subject
Reduction, Γ ⊢^Ĉ_{G'} s : *_p. This means that (s, *_p) is an axiom, which is
not the case. We conclude that the case P = P_1P_2 does not occur;

• P = λx:P_1.P_2. By the Generation Lemma, s =_β Πx:R_1.R_2 for some
R_1, R_2. This is impossible. We conclude that the case P = λx:P_1.P_2
does not occur;

• P = Πx:P_1.P_2. By the Generation Lemma, there are s_1, s_2 such that
Γ ⊢^Ĉ_{G'} P_1 : s_1 and Γ, x:P_1 ⊢^Ĉ_{G'} P_2 : s_2. By the induction hypothesis, P_1
and P_2 are in β-normal form. So P is in β-normal form.
∎
We conclude that replacing the Π-formation rules

    (*_s, *_s, *_f)   (*_s, *_f, *_f)   (*_s, □_p, □_p)

by parametric rules

    (*_s, *_s)   (*_s, □_p)

makes the implementations of first-order languages in type theory

• easier to implement (as the conversion rule becomes superfluous);


• more realistic (it gives, for example, an exact description of the terms
in first-order logic, something that cannot be done in the
parameter-free PTS proposed by Berardi).


Conclusions: Yet another extension of PTSs? 


Since PTSs were introduced, many extensions have been proposed
(see [6] for a non-exhaustive list). The reader may wonder why yet another
extension of PTSs is proposed in this Chapter, and whether it is more
interesting than those other extensions. In this section we answer
these questions.
Practical motivation


We already gave some reasons for the use of parameters at the beginning
of this Chapter:


• Our extension is compatible with (and can be seen as an extension
of) the extension of PTSs with definitions as proposed by Poll and
Severi, which is considered to be a standard way to introduce
definitions in PTSs. In fact, allowing only parametric constants with
zero parameters results in the D-PTSs of [114];

• Parameters and parametric definitions occur in many implementations
of type systems and, more generally, in programming languages.
The Pascal function double that was introduced at the beginning of
this Chapter can be described in our formalism by the context
declaration

    double(x:Int)=x+x:Int;

• The AUTOMATH systems, which form the basis for most modern proof
checkers that are based on type theory, can be described in our
system. The description given in Chapter 5 is precise, but it is not a
description that looks natural. The separate abstractors § and ¶ do
their job as well as possible in a type system without parameters, but
a description of AUTOMATH that includes parameters does more
justice to that system. Moreover, it places AUTOMATH in a more general
framework, so that it can easily be compared with other type systems
(see Figure 12 on page 283);

• Modern type systems, like LF and ML, have already been described
as one of the systems of the Barendregt Cube (Figure 13 on page
300). But in Section 6e we showed that a more detailed description
can be given in the refined Barendregt Cube of Figure 12;

• As argued in Section 6f, parameters are useful when describing
first-order logic in type theory. Compared to the traditional
PTS-representation (systems related to λP of the Barendregt Cube) of
first-order logic, parametric representations are

  - easier to implement (because the conversion rule is not needed);
  - closer to the original first-order language and therefore closer to
    the intuition;

• As argued in the beginning of this Chapter, parameters make it
possible to distinguish the attitude of users and developers of a system.
Often, the user only needs a (partially) parametrised version of the
system, whilst the developer wants to have the possibilities of full
λ-abstractions.


But “parameters give a better description of the type theory that is 
used in practice” is not the only argument in favour of the system of this 
Chapter. There is more than that. 


The heart of type theory 


In the Introduction we extensively discussed the notions of functionalisation 
and instantiation and declared them to be the heart of type theory. After 
our exploration of type theory throughout the present work, we still think 
they are, for more than one reason: 


• Functionalisation and instantiation stood at the cradle of type theory.
The story of type theory began with Frege's abstraction principles
(instantiation was not explicitly defined, but definitely present in an
implicit form), and the logical paradoxes that arose if one does not use
these principles carefully. Type theory made a careful use of Frege's
principles possible;

• An important application of modern type theory is logic. This is due
to the PAT-principle, which in turn is based on the interpretation
of → and ∀ as function types. And function types exist because of
functionalisation and instantiation.


The parameter mechanism shows us a new, different form of functionalisa- 
tion and instantiation and therefore makes the theory of functions richer 
and more interesting: 


• It gives us a better idea of the possibilities of the traditional forms of
functionalisation and instantiation;

• It places these traditional forms in a broader perspective by showing
that these forms are not the only possible forms of functionalisation
and instantiation.


In this light, the parameter mechanism is not only an extension of
Pure Type Systems (as depicted in Figure 10 on page 235), but also (and
particularly) a refinement of this framework, resulting in refinements of
parts of it, like the Barendregt Cube (Figure 11 on page 278).


Future work 


There are several things concerning parametric type systems that deserve 
to be studied in the future: 


• The meta-theoretical properties may have easier proofs than the ones
presented in this Chapter. In particular, the proof of strong
normalisation for a parametric type system is based on strong
normalisation for a PTS that may have more Π-formation rules. It would be
interesting to know whether (and to what extent) these rather strong
demands can be weakened;

• In the systems proposed in this chapter, it is not possible to have a
parametric constant (or definition) that takes a parametric function as
a parameter. For example: we want to formulate the property Ref(B)
for binary relations B over type T, indicating that this relation is
reflexive. In our current system, B cannot be a parametric function
b(x:T,y:T), because b(x:T,y:T) is not a term. We must make the full
λ-abstraction λx:T.λy:T.b(x,y) (which is a term) if we want to give
it as an argument to Ref.

It may be useful to design a system in which the parametric function
b(x:T,y:T) could be substituted for B without the need of making
the λ-abstractions;

• There may be a relation between the parameter mechanism of this
chapter and AUTOMATH, and the use of parameters in the representation
of higher order propositional functions in the ramified theory
of types of Russell and Whitehead.


Appendix A 


Preliminaries 


In this thesis we try to present various important type systems that were 
proposed during this century in a uniform framework. An important part 
of this framework is formed by the so-called Pure Type Systems (PTSs). 
Therefore, a short introduction to typed lambda calculus and PTSs is es- 
sential for the understanding of this thesis. 

Lambda calculus was introduced by Church [28, 29], as a formalisation 
of the notion of function. With this formal notation he could formulate 
his set of postulates for the foundation of logic. Kleene and Rosser [74] 
showed that Church’s set of postulates was inconsistent. The lambda cal- 
culus itself, however, appeared to be a very useful tool. In Chapter 2 of 
this thesis we showed that it is much more clear and accurate than the 
notion of (propositional) function as introduced by Russell and Whitehead 
in Principia Mathematica [121]. 

Nowadays, [4] is the standard work for (untyped) lambda calculus. We
present the basic definitions and properties of the λ-calculus in Section Aa.

As lambda calculus is a suitable framework for the formalisation of
functions, it is not surprising that it proved to be an excellent tool for
formalising the Simple Theory of Types [30]. In Section Ab we give a short
description of Church's formalisation. This formalisation is at the basis of
most modern type theories and especially at the basis of PTSs. PTSs were
introduced by Terlouw [118] and Berardi [13], providing a general
framework in which many type systems can be described. Section Ac presents
the definition of PTSs and Section Ad discusses the most important
meta-properties as described in [55], [5], and [54].


Aa Lambda calculus 


We give a description of typed lambda terms. This description follows the 
line of [5], as it mainly serves as a description of Pure Type Systems (PTSs). 


Definition A.1 Let V be a set of variables and C a set of constants. The
set 𝒯(V,C) (shorthand: 𝒯, if it is clear which sets V and C are used) of
typed lambda terms with variables from V and constants from C is defined
by the following abstract syntax:

    𝒯 ::= V | C | 𝒯𝒯 | λV:𝒯.𝒯 | ΠV:𝒯.𝒯.

If x does not occur in B then Πx:A.B is sometimes denoted by A → B.


We assume that V and C are countably infinite. We use ≡ to denote
syntactical equality between typed lambda terms.

We use x, y, z, α, β as meta-variables over V. In examples, we
sometimes want to use some specific elements of V; we use typewriter-style
to denote such specific elements. So: x (in typewriter style) is a specific
element of V, while x (in italics) is a meta-variable over V. The
typewriter-style variables x, y, z are assumed to be distinct elements
of V (so x ≢ y etc.), while meta-variables x,y,z,... may refer
to variables in the object language that are syntactically equal. We use
A,B,C,...,a,b,... as meta-variables over 𝒯.


A term λx:A.b has as intuitive interpretation the function that assigns
b[x:=a] (the term b in which each occurrence of x has been replaced by a)
to each a that belongs to (is an element of, has type) A. If b has type B,
then λx:A.b is a function from A to B. A → B should be interpreted as
the type of functions from A to B. This means: λx:A.b has type A → B.


Example A.2 λx:A.x is the identity function on A, and has type A → A.


In some situations, we allow that the type B of b depends on the variable
x. In that case, b[x:=a] is of type B[x:=a] for a of type A. Then λx:A.b is a
function with domain A and range ⋃_{a∈A} B[x:=a], with the special property
that the function value for a:A, b[x:=a], belongs to the subset B[x:=a] of
⋃_{a∈A} B[x:=a]. The type of such functions will be represented by Πx:A.B.


Example A.3 The polymorphic identity λy:*.λx:y.x has as type
Πy:*.y → y.


We could have written Πy:*.Πx:y.y instead of Πy:*.y → y. Here * can be
interpreted as the class of types.


Remark A.4 The term λy:type.λx:y.x in Example A.3 is a function of two
variables: y and x. The function is constructed by repeated λ-abstraction.
The first λ-abstraction (over x) leads to a function of one variable: λx:y.x,
and another λ-abstraction (over y) leads to the desired function of two
variables. The use of repeated λ-abstraction in order to represent functions
of more than one variable is called "currying" after H. B. Curry, though
currying was already discovered by Schönfinkel in 1924 [109], before Curry
discovered it, and the basic ideas for currying can already be found in the
works of Frege, which date from 1879 (see Section 1b1 of this thesis).
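
As an informal illustration of currying outside the typed setting, the
following Python sketch turns a function of two arguments into nested
functions of one argument each. The names curry, add, curried_add and
add_two are hypothetical and serve only this example.

```python
def curry(f):
    """Turn a function of two arguments into nested one-argument functions."""
    return lambda x: lambda y: f(x, y)


def add(x, y):
    return x + y


curried_add = curry(add)   # built by two nested abstractions
add_two = curried_add(2)   # a function of the one remaining variable
```

Supplying the first argument alone already yields a function, which is
exactly what repeated abstraction expresses.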


The following notational conventions allow us to reduce the number of 
brackets in terms: 


Notation A.5
• We write λx⃗:A⃗.B, or λ_{i=1}^n xi:Ai.A, as shorthand for
  λx1:A1.(λx2:A2.(···(λxn:An.A)···));
• We use Πx⃗:A⃗.B, or Π_{i=1}^n xi:Ai.A, as shorthand for
  Πx1:A1.(Πx2:A2.(···(Πxn:An.A)···));
• We write AB1···Bn as shorthand for (···((AB1)B2)···Bn).


Definition A.6 For A ∈ 𝒯 we define FV(A), the set of free variables of A,
as follows:


• FV(c) = ∅ for c ∈ C;
• FV(x) = {x} for x ∈ V;
• FV(A1A2) = FV(A1) ∪ FV(A2);
• FV(λx:A1.A2) = FV(A1) ∪ (FV(A2) \ {x});
• FV(Πx:A1.A2) = FV(A1) ∪ (FV(A2) \ {x}).
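
The syntax of Definition A.1 and the free-variable function can be
sketched in Python as follows. This is a minimal sketch under stated
assumptions: the class names Var, Const, App, Lam and Pi are our own, and
a binder λx:A1.A2 or Πx:A1.A2 is taken to bind x in the body A2 only (the
usual reading).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:          # lambda x:ty. body
    var: str
    ty: object
    body: object


@dataclass(frozen=True)
class Pi:           # Pi x:ty. body
    var: str
    ty: object
    body: object


def fv(t):
    """Set of free variable names of the term t."""
    if isinstance(t, Const):
        return frozenset()
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, App):
        return fv(t.fun) | fv(t.arg)
    # Lam and Pi: the bound variable is removed from the body only.
    return fv(t.ty) | (fv(t.body) - {t.var})


# The polymorphic identity of Example A.3 is a closed term.
poly_id = Lam("y", Const("*"), Lam("x", Var("y"), Var("x")))
```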


Definition A.7 If FV(a) = {x1,...,xn} and A1,...,An is a list of terms
then λ_{i=1}^n xi:Ai.a is a closure of a.


Notice that there may be many different closures of one and the same term 
a. 

A subset of the set of λ-terms that will be used in this thesis is the set
of the so-called λI-terms:


Definition A.8 Let V be a set of variables and C a set of constants. The
set of λI-terms 𝒯_I over V and C is defined as follows:


• If v ∈ V then v ∈ 𝒯_I; if c ∈ C then c ∈ 𝒯_I;
• If A,B ∈ 𝒯_I then AB ∈ 𝒯_I;
• If A,b ∈ 𝒯_I and x ∈ FV(b) then λx:A.b ∈ 𝒯_I;
• If A,B ∈ 𝒯_I then Πx:A.B ∈ 𝒯_I.


So within the set of λI-terms, a λ-abstraction λx:A.b can only be made
if the term b really depends on the variable x. This means that constant
functions, and functions of more variables that are constant in one or more
of their variables, are excluded from the set of λI-terms.


Terms that are equal up to a change of bound variables are considered 
to be syntactically equal. This allows us to assume the so-called Barendregt 
Convention: 


Convention A.9 (Barendregt Convention) Bound variables will be
chosen to be different from free variables. For instance, we write (λy:A.y)x
instead of (λx:A.x)x.

Once this variable convention has been assumed we can define
substitution in a straightforward manner (whereas the definition in [38] is
more complicated, and a formal definition of substitution is completely
absent in [28, 29] and [31]):


Definition A.10 (Substitution) We define A[x:=B] by induction on the
structure of A:


• y[x:=B] ≡ B if y ≡ x, and y[x:=B] ≡ y if y ≢ x;

• (A1A2)[x:=B] ≡ A1[x:=B]A2[x:=B];

• (λy:A1.A2)[x:=B] ≡ λy:A1[x:=B].A2[x:=B];

• (Πy:A1.A2)[x:=B] ≡ Πy:A1[x:=B].A2[x:=B].

We use the abbreviation A[xi:=Bi]_{i=m}^n to denote
A[xm:=Bm]···[xn:=Bn].


If m > n then A[xi:=Bi]_{i=m}^n denotes A. We also use the notation
A[x⃗:=B⃗] for A[xi:=Bi]_{i=1}^n.
On lambda terms we have the notion of β-reduction.


Definition A.11 (β-reduction) The relation →β is described by the
contraction rule

    (λx:A1.A2)B →β A2[x:=B]

and the usual compatibility rules (so: if A →β A' then AB →β A'B,
BA →β BA', λx:A.B →β λx:A'.B, λx:B.A →β λx:B.A', Πx:A.B →β
Πx:A'.B and Πx:B.A →β Πx:B.A').

↠β is the smallest reflexive and transitive relation that includes →β;
=β is the smallest reflexive, symmetric and transitive relation that includes
→β. By A ↠β⁺ B we indicate that A ↠β B, but A ≢ B.

A term that has no subterm of the form (λx:A1.A2)B is a term in
β-normal form, or a normal form if no confusion arises. We write A ↠β^nf B
if A ↠β B and B is in β-normal form. Similarly, A ↠β⁺^nf B if A ↠β⁺ B and
B is in β-normal form.
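
Substitution and reduction to β-normal form can be sketched in Python as
follows. This is a minimal sketch under our own assumptions: terms are
encoded as tagged tuples ("var", x), ("const", c), ("app", F, A),
("lam", x, A, b) and ("pi", x, A, B); instead of relying on the variable
convention, subst renames a bound variable when a capture would occur; and
normalise uses a leftmost-outermost strategy with a step budget (the
parameter fuel), since untyped terms need not have a normal form.

```python
import itertools


def fv(t):
    """Free variables of a tagged-tuple term."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "const":
        return set()
    if tag == "app":
        return fv(t[1]) | fv(t[2])
    _, x, ty, body = t                      # "lam" or "pi"
    return fv(ty) | (fv(body) - {x})


def fresh(avoid):
    """A variable name v0, v1, ... that does not occur in `avoid`."""
    for i in itertools.count():
        name = f"v{i}"
        if name not in avoid:
            return name


def subst(t, x, s):
    """t[x:=s], renaming bound variables where needed to avoid capture."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == x else t
    if tag == "const":
        return t
    if tag == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, ty, body = t
    new_ty = subst(ty, x, s)
    if y == x:                              # x is rebound: body is untouched
        return (tag, y, new_ty, body)
    if y in fv(s):                          # restore the variable convention
        z = fresh(fv(s) | fv(body) | {x})
        body = subst(body, y, ("var", z))
        y = z
    return (tag, y, new_ty, subst(body, x, s))


def step(t):
    """One leftmost-outermost beta step, or None if t is in normal form."""
    tag = t[0]
    if tag in ("var", "const"):
        return None
    if tag == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":                   # contraction rule
            return subst(f[3], f[1], a)
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        return None if r is None else ("app", f, r)
    _, x, ty, body = t                      # compatibility under binders
    r = step(ty)
    if r is not None:
        return (tag, x, r, body)
    r = step(body)
    return None if r is None else (tag, x, ty, r)


def normalise(t, fuel=1000):
    """Reduce t until no redex is left, giving up after `fuel` steps."""
    for _ in range(fuel):
        r = step(t)
        if r is None:
            return t
        t = r
    raise RuntimeError("no normal form found within the step budget")
```

The renaming branch in subst plays the role of Convention A.9: it only
fires when the term being substituted has a free variable with the same
name as the binder.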
The most important property of →β is the so-called Church-Rosser
property:


Theorem A.12 If A ↠β B1 and A ↠β B2 then there is C such that
B1 ↠β C and B2 ↠β C.
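
As a small worked instance of this property (our own illustration, with f
and a free variables and α an arbitrary type), take a term with two
redexes. Contracting the outer or the inner redex first gives different
terms B1 and B2, which nevertheless reduce to a common term C:

```latex
\begin{align*}
A   &\equiv (\lambda x{:}\alpha.\, f\,x\,x)\,((\lambda y{:}\alpha.\, y)\,a) \\
A   &\to_\beta f\,((\lambda y{:}\alpha.\, y)\,a)\,((\lambda y{:}\alpha.\, y)\,a) \equiv B_1 \\
A   &\to_\beta (\lambda x{:}\alpha.\, f\,x\,x)\,a \equiv B_2 \\
B_1 &\to_\beta f\,a\,((\lambda y{:}\alpha.\, y)\,a) \to_\beta f\,a\,a \equiv C \\
B_2 &\to_\beta f\,a\,a \equiv C
\end{align*}
```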


There are numerous proofs of this theorem in the literature. The most
well-known is via the Strip Lemma (see [4], Chapter 11); another short
and elegant proof is given by Tait and Martin-Löf [88], also described in
Chapter 3 of [4].

In this thesis we see many variants on the basic lambda terms of
Definition A.1, for instance lambda terms with parameters in Chapter 6. It is
easy to prove that these variants have the Church-Rosser property for →β
as well.


Ab Simply typed λ-calculus


We give a definition of the simply typed λ-calculus as introduced by Church
[30] in 1940.


Definition A.13 The types of λ→ are defined as follows:
• ι and o are types;
• If α and β are types, then so is α → β.


We denote the set of simple types by 𝕋.


ι represents the type of individuals; o is the type of propositions. α → β
is the type of functions with domain α and range β. We use α,β,... as
meta-variables over types. → associates to the right: α → β → γ denotes
α → (β → γ).


The terms of the original presentation of λ→ are a bit different from the
presentation in [5]. We give some explanation after repeating the original
definition.


Definition A.14 The terms of λ→ are the following:


• ¬, ∧, ∀_α for each type α, and ℩_α for each type α, are terms;
• A variable is a term;
• If A, B are terms, then so is AB;
• If A is a term, and x a variable, then λx:α.A is a term.


Definition A.15 A context in λ→ is a set {x1:α1,...,xn:αn} where the xi
are distinct variables and the αi are types.


Some terms are typable (legal) in λ→, according to the following
derivation rules:


Definition A.16 The judgement Γ ⊢ A : α holds if it can be derived using
the following rules:
• Γ ⊢ ¬ : o → o;
  Γ ⊢ ∧ : o → o → o;
  Γ ⊢ ∀_α : (α → o) → o;
  Γ ⊢ ℩_α : (α → o) → α;
• Γ ⊢ x : α if x:α ∈ Γ;
• If Γ,x:α ⊢ A : β then Γ ⊢ (λx:α.A) : α → β;
• If Γ ⊢ A : α → β and Γ ⊢ B : α then Γ ⊢ (AB) : β.
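
Because these rules are syntax-directed, they can be read as a type
inference procedure. The following Python sketch does so under our own
assumptions: the base types are written "i" (for ι) and "o", a function
type α → β is the tuple ("->", α, β), a context is a dictionary from
variable names to types, and the term encoding with the tags "not", "and",
"forall", "iota", "var", "lam" and "app" is hypothetical.

```python
def arrow(a, b):
    """The function type a -> b."""
    return ("->", a, b)


def type_of(ctx, term):
    """Return the type of `term` in context `ctx`, or raise TypeError."""
    tag = term[0]
    if tag == "not":                                  # negation
        return arrow("o", "o")
    if tag == "and":                                  # conjunction
        return arrow("o", arrow("o", "o"))
    if tag == "forall":                               # quantifier at type term[1]
        return arrow(arrow(term[1], "o"), "o")
    if tag == "iota":                                 # description at type term[1]
        return arrow(arrow(term[1], "o"), term[1])
    if tag == "var":
        if term[1] not in ctx:
            raise TypeError(f"undeclared variable {term[1]}")
        return ctx[term[1]]
    if tag == "lam":                                  # abstraction rule
        _, x, alpha, body = term
        return arrow(alpha, type_of({**ctx, x: alpha}, body))
    if tag == "app":                                  # application rule
        f_ty = type_of(ctx, term[1])
        a_ty = type_of(ctx, term[2])
        if isinstance(f_ty, tuple) and f_ty[0] == "->" and f_ty[1] == a_ty:
            return f_ty[2]
        raise TypeError("ill-typed application")
    raise ValueError(f"unknown term {term!r}")
```

For instance, the identity on individuals, ("lam", "x", "i", ("var", "x")),
is assigned the type ("->", "i", "i"), in line with Example A.2.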


We use ⊢_{λ→} if we need to distinguish derivability in λ→ from
derivability in other type systems.

The simply typed λ-calculus can be seen as a pure type system, and
therefore has the properties of pure type systems, which can be found at the
end of the following Section. To adapt the simply typed λ-calculus to a
pure type system, some amendments are made:


• The two basic types ι, o are replaced by an infinite set of type
variables;


• The constants ¬, ∧, ∀_α and ℩_α are not introduced in the
PTS-presentation.


These adaptations do not seriously affect the system and are only used to
make λ→ fit in the PTS-framework.

Ac Pure type systems


Pure Type Systems (PTSs) were introduced by Berardi [13] and Terlouw 
[118] as a general framework in which many current type systems can be 
described. The framework is a generalisation of the well-known Barendregt 
Cube. 

Though PTSs were not introduced before 1988, they were already im- 
plicitly present in Nederpelt’s thesis ([91], Chapter III, Definition 1.3) 
and many rules are highly influenced by rules of known type systems like 
Church’s Simple Theory of Types [30] and Automath (see 5.5.4. of [39], 
and Section 5a). 

The description below is based on [5]. 


Definition A.17 (Specification) A specification is a triple (S, A, R),
such that S ⊆ C, A ⊆ S × S and R ⊆ S × S × S. The specification is
called singly sorted if A is a (partial) function S → S, and R is a (partial)
function S × S → S. S is called the set of sorts, A is the set of axioms,
and R is the set of (Π-formation) rules of the specification.
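
For a finite specification, the singly-sortedness condition can be checked
mechanically. The sketch below is an illustration under our own
assumptions: axioms are given as pairs (s1, s2), rules as triples
(s1, s2, s3), and the function names are hypothetical.

```python
def is_functional(pairs):
    """True if no key in the (key, value) pairs has two distinct values."""
    seen = {}
    for key, value in pairs:
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_singly_sorted(axioms, rules):
    """A is a partial function S -> S and R a partial function S x S -> S."""
    return is_functional(axioms) and is_functional(
        ((s1, s2), s3) for (s1, s2, s3) in rules
    )
```

A specification with the single axiom ("*", "□") and rules whose third
component is determined by the first two passes the check; adding a second
axiom for "*" makes it fail.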


Definition A.18 (Contexts) A context is a finite (possibly empty) list
x1:A1,...,xn:An (shorthand: x⃗:A⃗) of variable declarations. {x1,...,xn} is
called the domain DOM(x⃗:A⃗) of the context. The empty context is denoted
⟨⟩.


We use Γ, Δ as meta-variables for contexts.
Substitution can be extended to contexts: 


Definition A.19 We define Γ[x:=A] by induction on the length of Γ:


• ⟨⟩[x:=A] ≡ ⟨⟩;
• (Γ,y:B)[x:=A] ≡ Γ[x:=A] if x ≡ y, and
  (Γ,y:B)[x:=A] ≡ Γ[x:=A], y:B[x:=A] if x ≢ y.


Definition A.20 (Pure Type Systems) Let G = (S, A, R) be a
specification. The Pure Type System λG describes in which ways judgements
Γ ⊢_G A : B (or Γ ⊢ A : B, if it is clear which G is used) can be derived.
Γ ⊢ A : B states that A has type B in context Γ.


(axiom)   ⟨⟩ ⊢ s1 : s2                          (s1,s2) ∈ A

          Γ ⊢ A : s
(start)   ─────────────────                     x ∉ DOM(Γ)
          Γ,x:A ⊢ x : A

          Γ ⊢ A : B    Γ ⊢ C : s
(weak)    ───────────────────────               x ∉ DOM(Γ)
          Γ,x:C ⊢ A : B

          Γ ⊢ A : s1    Γ,x:A ⊢ B : s2
(Π)       ─────────────────────────────         (s1,s2,s3) ∈ R
          Γ ⊢ (Πx:A.B) : s3

          Γ,x:A ⊢ b : B    Γ ⊢ (Πx:A.B) : s
(λ)       ──────────────────────────────────
          Γ ⊢ (λx:A.b) : (Πx:A.B)

          Γ ⊢ F : (Πx:A.B)    Γ ⊢ a : A
(appl)    ──────────────────────────────
          Γ ⊢ Fa : B[x:=a]

          Γ ⊢ A : B    Γ ⊢ B' : s    B =β B'
(conv)    ───────────────────────────────────
          Γ ⊢ A : B'


A context Γ is legal if there are A,B such that Γ ⊢ A : B. A term A is
legal if there are Γ,B such that Γ ⊢ A : B or Γ ⊢ B : A.

An important class of examples of PTSs is formed by the eight PTSs
of the so-called Barendregt Cube. These systems all have {*, □} as set of
sorts, and *:□ as only axiom, but they differ on the Π-formation rules that
are allowed, depending on which triples are in R:


λ→     (*,*,*)
λ2     (*,*,*)   (□,*,*)
λP     (*,*,*)             (*,□,□)
λω̲     (*,*,*)                       (□,□,□)
λP2    (*,*,*)   (□,*,*)   (*,□,□)
λω     (*,*,*)   (□,*,*)             (□,□,□)
λPω̲    (*,*,*)             (*,□,□)   (□,□,□)
λC     (*,*,*)   (□,*,*)   (*,□,□)   (□,□,□)
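
The table can be treated as data: each system is a set of rule triples, and
one system is contained in another when its rule set is a subset. The
Python sketch below is our own illustration; the dictionary CUBE, the
names is_subsystem and edges, and the ASCII spellings "λω_" and "λPω_" for
the underlined systems are assumptions made for this example.

```python
STAR, BOX = "*", "□"

BASE = (STAR, STAR, STAR)
POLY = (BOX, STAR, STAR)    # terms depending on types
DEP = (STAR, BOX, BOX)      # types depending on terms
CONS = (BOX, BOX, BOX)      # types depending on types

CUBE = {
    "λ→": {BASE},
    "λ2": {BASE, POLY},
    "λP": {BASE, DEP},
    "λω_": {BASE, CONS},
    "λP2": {BASE, POLY, DEP},
    "λω": {BASE, POLY, CONS},
    "λPω_": {BASE, DEP, CONS},
    "λC": {BASE, POLY, DEP, CONS},
}


def is_subsystem(a, b):
    """Every rule of system a is also a rule of system b."""
    return CUBE[a] <= CUBE[b]


def edges():
    """Pairs (a, b) where b has exactly one rule more than a."""
    return sorted(
        (a, b)
        for a in CUBE
        for b in CUBE
        if CUBE[a] < CUBE[b] and len(CUBE[b]) == len(CUBE[a]) + 1
    )
```

The pairs returned by edges() are the inclusions along the edges of the
Cube: every system contains λ→ and is contained in λC.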


The dependencies between these systems can be depicted in the Baren- 
dregt Cube (see Figure 13). 

The systems in the Cube are related to many other type systems. The
overview below is taken from [5].

λ→     simply typed λ-calculus; [30], [4] (Appendix A), [65] (Chapter 14)
λ2     second order typed λ-calculus; [56], [103]
λP     AUT-QE (footnote 1) [21]; LF (footnote 2) [59]
λP2    [84]
λω̲     POLYREC [102]
λω     Fω [56]
λC     Calculus of Constructions; [35]

Footnote 1 (by the author): A more precise study of AUT-QE, respecting
the parameter structure of AUT-QE, shows that AUT-QE can be positioned a
little bit higher in the Cube: exactly in between λP and λC. See Chapter 6,
especially Section 6e4.

Footnote 2 (by the author): In Chapter 6, Section 6e2, we show that the
practical use of LF does not use the full power of λP. In the refinement of
the Barendregt Cube presented there, we show that the use of LF in practice
corresponds to a system that is in between λ→ and λP.

Another PTS that occurs in this thesis is the Extended Calculus of
Constructions ECC (see [86]). This is a PTS with


S = ℕ;
A = {(n,n+1) | n ∈ ℕ};
R = {(m,0,0) | m ∈ ℕ} ∪ {(m,n,r) | 0 ≤ m,n ≤ r}.


This is indeed an extension of λC (write * for 0 and □ for 1).
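
The claim that ECC extends λC can be checked on the specification itself.
The sketch below is an illustration under a stated assumption: the side
condition of the second set of rules is read as m ≤ r and n ≤ r, which is
the reading under which all four rules of λC are instances. The function
names and the list LAMBDA_C_RULES are our own.

```python
def is_sort(n):
    """Sorts are the natural numbers."""
    return isinstance(n, int) and n >= 0


def is_axiom(s1, s2):
    """(n, n+1) for every natural number n."""
    return is_sort(s1) and s2 == s1 + 1


def is_rule(m, n, r):
    """(m, 0, 0) for every m, or (m, n, r) with m <= r and n <= r (assumed reading)."""
    if not (is_sort(m) and is_sort(n) and is_sort(r)):
        return False
    return (n == 0 and r == 0) or (m <= r and n <= r)


# The rules of λC, writing 0 for * and 1 for □.
LAMBDA_C_RULES = [(0, 0, 0), (1, 0, 0), (0, 1, 1), (1, 1, 1)]
```

Under this reading the axiom *:□ becomes (0, 1), and every rule in
LAMBDA_C_RULES satisfies is_rule.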


Ad Metaproperties of PTSs 


Pure Type Systems have some important meta-properties, which we
describe below. The proofs can be found in [55] and [54]. Throughout
this section, ⊢ denotes derivability in a PTS with a certain specification
G = (S, A, R).


Lemma A.21 (Restricted Weakening) If Γ ⊢ A : B, we may assume
the derivation of Γ ⊢ A : B to contain only applications of the rule (weak)
that are of the form

    Γ ⊢ v : B    Γ ⊢ C : s
    ───────────────────────
    Γ,x:C ⊢ v : B

where v ∈ V ∪ C (that is, v is a variable or a constant).


Lemma A.22 (Free Variable Lemma) Let Γ ≡ x1:A1,...,xn:An be
legal, say Γ ⊢ B : C. Then


1. The xi are distinct;
2. FV(B), FV(C) ⊆ DOM(Γ);
3. FV(Ai) ⊆ {x1,...,x_{i-1}} for 1 ≤ i ≤ n.

Lemma A.23 (Start Lemma) Let Γ be legal. Then
1. Γ ⊢ s1 : s2 for all (s1,s2) ∈ A;
2. Γ ⊢ x : A for all (x:A) ∈ Γ.


Lemma A.24 (Transitivity Lemma) Let Γ, Δ be contexts. Assume Γ
is legal, Γ ⊢ x : A for all (x:A) ∈ Δ, and Δ ⊢ B : C. Then Γ ⊢ B : C.

Lemma A.25 (Thinning Lemma) If Δ is legal, Γ ⊆ Δ and Γ ⊢ A : B,
then Δ ⊢ A : B.


Lemma A.26 (Substitution Lemma) If Γ,x:A,Δ ⊢ B : C and Γ ⊢
D : A then Γ,Δ[x:=D] ⊢ B[x:=D] : C[x:=D].


Lemma A.27 (Generation Lemma)


1. If Γ ⊢ c : C for a constant c ∈ C then there is s ∈ S such that
C =β s and (c,s) ∈ A;


2. If Γ ⊢ x : C then there is s ∈ S and B =β C such that Γ ⊢ B : s and
(x:B) ∈ Γ;


3. If Γ ⊢ (Πx:A.B) : C then there is (s1,s2,s3) ∈ R such that Γ ⊢ A : s1,
Γ,x:A ⊢ B : s2 and C =β s3;


4. If Γ ⊢ (λx:A.b) : C then there is s ∈ S and B such that Γ ⊢
(Πx:A.B) : s; Γ,x:A ⊢ b : B; and C =β (Πx:A.B);


5. If Γ ⊢ Fa : C then there are A,B such that Γ ⊢ F : (Πx:A.B),
Γ ⊢ a : A and C =β B[x:=a].


Lemma A.28 (Correctness of Types) If Γ ⊢ A : B then B ≡ s or
Γ ⊢ B : s for some s ∈ S.


Lemma A.29 (Subterm Lemma) If A is legal and B is a subterm of 
A, then B is legal. 


Lemma A.30 (Subject Reduction) If Γ ⊢ A : B and A ↠β A' then
Γ ⊢ A' : B.


Lemma A.31 (Strengthening Lemma) If Γ,x:A,Δ ⊢ B : C and x ∉
FV(Δ) ∪ FV(B) ∪ FV(C), then Γ,Δ ⊢ B : C.


The proof of this lemma is due to Van Benthem Jutting [12]. 


Lemma A.32 (Unicity of Types) If G is singly sorted, Γ ⊢ A : B1 and
Γ ⊢ A : B2, then B1 =β B2.

Lemma A.33 (Strong Permutation Lemma) If Γ,x:A,y:B,Δ ⊢ C : D
and x ∉ FV(B), then Γ,y:B,x:A,Δ ⊢ C : D.


Definition A.34 (Topsort) A sort s is a topsort if there is no s' ∈ S such
that (s,s') ∈ A.


Lemma A.35 (Topsort Lemma) If s is a topsort and Γ ⊢ A : s then A
is not of the form A1A2 or λx:A1.A2.


Theorem A.36 (Strong Normalisation for ECC) Let A be a legal
term in the Extended Calculus of Constructions. Then A is strongly
normalising.


As the systems of the Barendregt Cube are subsystems of ECC, all legal 
terms in the systems of the Barendregt Cube are strongly normalising, too. 


Appendix B 


Type systems in this thesis 


Ba The Ramified Theory of Types 


Ba1 RTT


Definition B.1 (Propositional functions, 2.3) We define a collection
P of propositional functions (pfs), and for each element f of P we
simultaneously define the collection FV(f) of free variables of f:


1. If i1,...,i_{a(R)} ∈ A ∪ V then R(i1,...,i_{a(R)}) ∈ P.
   FV(R(i1,...,i_{a(R)})) =def {i1,...,i_{a(R)}} ∩ V;


2. If f,g ∈ P then f ∨ g ∈ P and ¬f ∈ P.
   FV(f ∨ g) =def FV(f) ∪ FV(g); FV(¬f) =def FV(f);


3. If f ∈ P and x ∈ FV(f) then ∀x[f] ∈ P.
   FV(∀x[f]) =def FV(f) \ {x};


4. If n ∈ ℕ and k1,...,kn ∈ A ∪ V ∪ P, then z(k1,...,kn) ∈ P.
   FV(z(k1,...,kn)) =def {z,k1,...,kn} ∩ V.
   If n = 0 then we write z() in order to distinguish the pf z() from the
   variable z;


5. All pfs can be constructed by using the construction-rules 1, 2, 3 and
   4 above.
Definition B.2 (Ramified types, 2.37)
1. 0^0 is a ramified type;


2. If t1^a1,...,tn^an are ramified types, and a ∈ ℕ, a > max(a1,...,an),
   then (t1^a1,...,tn^an)^a is a ramified type (if n = 0 then take a ≥ 0);


3. All ramified types can be constructed using the rules 1 and 2.


If t^a is a ramified type, then a is called the order of t^a.


Definition B.3 (Predicative types, 2.41)
1. 0^0 is a predicative type;
2. If t1^a1,...,tn^an are predicative types, and a = 1 + max(a1,...,an)
   (take a = 0 if n = 0), then (t1^a1,...,tn^an)^a is a predicative type;


3. All predicative types can be constructed using the rules 1 and 2 above.
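
The difference between the two definitions lies only in the condition on
the order a, which the following Python sketch makes explicit. It rests on
our own assumptions: the base type 0^0 is the string BASE, a compound type
(t1^a1,...,tn^an)^a is the pair (components, a), and for n = 0 a ramified
type may have any order a ≥ 0.

```python
BASE = "0^0"   # the type of individuals, of order 0


def order(t):
    """The order of a type: 0 for the base type, else its stored order."""
    return 0 if t == BASE else t[1]


def is_ramified(t):
    """The order must exceed the orders of all components."""
    if t == BASE:
        return True
    components, a = t
    if not all(is_ramified(c) for c in components):
        return False
    if not components:
        return a >= 0
    return a > max(order(c) for c in components)


def is_predicative(t):
    """The order must be exactly one more than the maximal component order."""
    if t == BASE:
        return True
    components, a = t
    if not all(is_predicative(c) for c in components):
        return False
    expected = 1 + max(order(c) for c in components) if components else 0
    return a == expected
```

For example, ((BASE,), 1) is predicative, whereas ((BASE,), 3) is ramified
but not predicative, because its order is higher than strictly needed.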


Definition B.4 (Contexts, 2.43) Let x1,...,xn ∈ V be distinct
variables, and assume t1^a1,...,tn^an are ramified types. Then
{x1:t1^a1,...,xn:tn^an} is a context. The set {x1,...,xn} is called the
domain of the context and is denoted by dom({x1:t1^a1,...,xn:tn^an}).


Definition B.5 (Ramified Theory of Types: RTT, 2.45) The judgement
Γ ⊢ f : t^a is inductively defined as follows:


1. (start) For all a ∈ A:
       ⊢ a : 0^0.
   For all atomic pfs f:
       ⊢ f : ()^0;

2. (connectives) Assume Γ ⊢ f : (t1^a1,...,tn^an)^a,
   Δ ⊢ g : (u1^b1,...,um^bm)^b, and x < y for all x ∈ dom(Γ) and
   y ∈ dom(Δ). Then

       Γ ∪ Δ ⊢ f ∨ g : (t1^a1,...,tn^an,u1^b1,...,um^bm)^max(a,b);
       Γ ⊢ ¬f : (t1^a1,...,tn^an)^a;

3. (abstraction from parameters) If Γ ⊢ f : (t1^a1,...,tm^am)^a,
   t_{m+1}^{a_{m+1}} is a predicative type, g ∈ A ∪ P is a parameter of f,
   Γ ⊢ g : t_{m+1}^{a_{m+1}}, and x < y for all x ∈ dom(Γ), then

       Γ' ⊢ h : (t1^a1,...,t_{m+1}^{a_{m+1}})^max(a,a_{m+1}+1).

   Here, h is a pf obtained by replacing all parameters g' of f which
   are α_Γ-equal to g by y. Moreover, Γ' is the subset of the context
   Γ ∪ {y : t_{m+1}^{a_{m+1}}} such that dom(Γ') contains exactly all the
   variables that occur in h;

4. (abstraction from pfs) If (t1^a1,...,tn^an)^a is a predicative type,
   Γ ⊢ f : (t1^a1,...,tn^an)^a, x < z for all x ∈ dom(Γ), and
   y1 < ··· < yn are the free variables of f, then

       Γ' ⊢ z(y1,...,yn) : (t1^a1,...,tn^an,(t1^a1,...,tn^an)^a)^(a+1),

   where Γ' is the subset of Γ ∪ {z:(t1^a1,...,tn^an)^a} such that
   dom(Γ') = {y1,...,yn,z};

5. (weakening) If Γ, Δ are contexts, Γ ⊆ Δ, and Γ ⊢ f : t^a, then also
   Δ ⊢ f : t^a;

6. (substitution) If y is the ith free variable in f (according to the order
   on variables), and Γ ∪ {y : ti^ai} ⊢ f : (t1^a1,...,tn^an)^a, and
   Γ ⊢ k : ti^ai, then

       Γ' ⊢ f[y:=k] : (t1^a1,...,t_{i-1}^{a_{i-1}},t_{i+1}^{a_{i+1}},...,tn^an)^b.

   Here, b = 1 + max(a1,...,a_{i-1},a_{i+1},...,an,c), and

       c = max{j | ∀x:t^j occurs in f[y:=k]}

   (if n = 1 and {j | ∀x:t^j occurs in f[y:=k]} = ∅ then take b = 0) and
   once more, Γ' is the subset of Γ ∪ {y : ti^ai} such that dom(Γ') contains
   exactly all the variables that occur in f[y:=k];

7. (permutation) If y is the ith free variable in f (according to the
   order on variables), and Γ ∪ {y:ti^ai} ⊢ f : (t1^a1,...,tn^an)^a, and
   x < y' for all x ∈ dom(Γ), then

       Γ' ⊢ f[y:=y'] : (t1^a1,...,t_{i-1}^{a_{i-1}},t_{i+1}^{a_{i+1}},...,tn^an,ti^ai)^a.

   Γ' is the subset of Γ ∪ {y:ti^ai, y':ti^ai} such that dom(Γ') contains
   exactly all the variables that occur in f[y:=y'];

8. (quantification) If y is the ith free variable in f (according to the
   order on variables), and Γ ∪ {y:ti^ai} ⊢ f : (t1^a1,...,tn^an)^a, then

       Γ ⊢ ∀y:ti^ai[f] : (t1^a1,...,t_{i-1}^{a_{i-1}},t_{i+1}^{a_{i+1}},...,tn^an)^a.


Ba2 λRTT


Definition B.6 (Terms of λRTT, 4.3) Let A, V and R be as in Chapter
2. Define the set 𝒯 of terms of λRTT by:


T u= | |On [ALV Re A| 
TT | AV:T.T WITT 


Definition B.7 (Derivation Rules for λRTT, 4.9) Let s,s1,s2 range
over S = {*_s,□_1,□_2,...,*_1,*_2,...}, and let


R = FREE > 1}U 
{(Om, On, On) LS mn} 
(Gems #ns *max(mn)) mn 2 1} U 
{(*s,*ny#n) [rn 2 1} U 
LR enden | 1<m< n}. 


. The derivation rules for ARTT are as follows: 


(Axioms) Fen En (n > 1) 
Fick 
FL: 
Fte (a € A) 


Ritten (RER) 
u 
a(R) times « 


(Start) ri fer 


DM: N TH-A:s 
(Weak) ——TzArMm:N 


At: 
(II-form) ICT Pre (s1, 52,53) E€ R
iin) T,x:Atb:B TH (Ix:A.B):s 
(ist TF (Az:Ab) : (I:A.B) 

TEM : Iz:A.B TEN:A 


(Il-el) — TFMN:Be=N 
Peaeb “Tepe Beb 
TFA:B THB:s BB 
(Conv) TFA:B 
TEAS 
(Incl) Ta 


Bb AUTOMATH 


Bb1 AUT-68


Definition B.8 (Expressions, 5.1) We define the set ℰ of
AUT-68-expressions inductively:


(variable) If x ∈ V then x ∈ ℰ;


(parameter) If a ∈ C, n ∈ ℕ (n = 0 is allowed) and Σ1,...,Σn ∈ ℰ then
a(Σ1,...,Σn) ∈ ℰ;


(abstraction) If x ∈ V, Σ ∈ ℰ ∪ {type} and Ω ∈ ℰ then [x:Σ]Ω ∈ ℰ;
(application) If Σ1,Σ2 ∈ ℰ then ⟨Σ2⟩Σ1 ∈ ℰ.
We also define ℰ⁺ =def ℰ ∪ {type}.


Definition B.9 (Books and lines, 5.7) An AUT-68-book is a finite list
(possibly empty) of (AUT-68)-lines (to be defined next). If l1,...,ln are
the lines of book 𝔅, we write 𝔅 ≡ l1,...,ln.

An AUT-68-line is a 4-tuple (Γ;k;Σ1;Σ2). Here,


• Γ is a context, i.e. a finite (possibly empty) list x1:α1,...,xn:αn,
where the xi are different elements of V and the αi are elements of
ℰ⁺;


• k is an element of V ∪ C;


• Σ1 can be (only):
  ◦ The symbol — (if k ∈ V);
  ◦ The symbol PN (if k ∈ C);
  ◦ An element of ℰ (if k ∈ C);


• Σ2 is an element of ℰ⁺.


Definition B.10 (Correct books and contexts, 5.10) A book 𝔅 and
a context Γ are correct if 𝔅;Γ ⊢ OK can be derived with the following rules


(axiom)             ∅;∅ ⊢ OK

                    𝔅1,(Γ;x;—;α),𝔅2; Γ ⊢ OK
(context ext.)      ────────────────────────────
                    𝔅1,(Γ;x;—;α),𝔅2; Γ,x:α ⊢ OK

                    𝔅;Γ ⊢ OK
(book ext.: var1)   ────────────────────────
                    𝔅,(Γ;x;—;type); ∅ ⊢ OK

                    𝔅;Γ ⊢ Σ2 : type
(book ext.: var2)   ────────────────────────
                    𝔅,(Γ;x;—;Σ2); ∅ ⊢ OK

                    𝔅;Γ ⊢ OK
(book ext.: pn1)    ────────────────────────
                    𝔅,(Γ;k;PN;type); ∅ ⊢ OK

                    𝔅;Γ ⊢ Σ2 : type
(book ext.: pn2)    ────────────────────────
                    𝔅,(Γ;k;PN;Σ2); ∅ ⊢ OK

                    𝔅;Γ ⊢ Σ1 : type
(book ext.: def1)   ────────────────────────
                    𝔅,(Γ;k;Σ1;type); ∅ ⊢ OK

                    𝔅;Γ ⊢ Σ2 : type    𝔅;Γ ⊢ Σ1 : Σ2'    𝔅;Γ ⊢ Σ2 =βd Σ2'
(book ext.: def2)   ─────────────────────────────────────────────────────
                    𝔅,(Γ;k;Σ1;Σ2); ∅ ⊢ OK


For the (book ext.) rules, we assume that the introduced identifiers x ∈ V
and k ∈ C do not occur anywhere in 𝔅 and Γ.


Definition B.11 (Correct statements, 5.11) A statement 𝔅;Γ ⊢ Σ : Ω
is correct if it can be derived with the rules below (the start rule uses the
notions of correct context and correct book as given in Definition 5.10).

           𝔅;Γ1,x:α,Γ2 ⊢ OK
(start)    ──────────────────────
           𝔅;Γ1,x:α,Γ2 ⊢ x : α


(parameters) 


B = Bi, (1:01, … ,En:Qn; b; 21; N2), Bo 
B;T F Di:aleı;- se ‚&-ı=3, Kuga RE = b ya ‚n) 


(abstr.1) 


(abstr.2 
B;TH%ı:t B;T,r:%ıF Ort BT, x: F Do: 


(application) 
(conversion) 


B;TH%:9 B;TH Oe:type BT E Ni =ga Qe 
; + 362 


When using the parameter rule, we assume that 𝔅;Γ ⊢ OK, even if n = 0.


Bb2 λ68

Definition B.12 (Terms, 5.21.1) The terms of λ68 form a set 𝒯 defined
by

    𝒯 ::= V | C | S | 𝒯𝒯 | λV:𝒯.𝒯 | §V:𝒯.𝒯 | ΠV:𝒯.𝒯 | ¶V:𝒯.𝒯,

where S is the set of sorts {*,□,△}.


Definition B.13 (Contexts, 5.21.2) We define the notion of context
inductively:


• ∅;∅ is a context; DOM(∅;∅) = ∅;


• If Δ;Γ is a context, x ∈ V, x does not occur in Δ;Γ and A ∈ 𝒯, then
Δ;Γ,x:A is a context (x is a newly introduced variable);
DOM(Δ;Γ,x:A) = DOM(Δ;Γ) ∪ {x};


• If Δ;Γ is a context, b ∈ C, b does not occur in Δ;Γ and A ∈ 𝒯 then
Δ,b:A;Γ is a context (in this case b is a primitive constant; cf. the
primitive notions of AUTOMATH in Section 5a1); DOM(Δ,b:A;Γ) =
DOM(Δ;Γ) ∪ {b};

• If Δ;Γ is a context, b ∈ C, b does not occur in Δ;Γ, A ∈ 𝒯, and T ∈ 𝒯,
then Δ,b:=T:A;Γ is a context (in this case b is a defined constant; cf.
the definitions of AUTOMATH in Section 5a1); DOM(Δ,b:=T:A;Γ) =
DOM(Δ;Γ) ∪ {b}.


Definition B.14 (Derivation rules, 5.21.3) 


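\[
\text{(Axiom)} \quad \varnothing;\varnothing \vdash * : \Box
\]

\[
\text{(Start: v)} \quad
\frac{\Delta;\Gamma \vdash A : s}{\Delta;\Gamma, x{:}A \vdash x : A}
\]

where $s = *, \Box$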
(Start: pc) AD Birdie 


where s = *,0 


(Start: dc) 


where sj = *,0 
\[
\text{(Weak: v)} \quad
\frac{\Delta;\Gamma \vdash M : N \qquad \Delta;\Gamma \vdash A : s}
     {\Delta;\Gamma, x{:}A \vdash M : N}
\]

where $s = *, \Box$


A: ST.B: 82 


(Weak: pc) 
where sj = *, 0 


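\[
\text{(Weak: dc)} \quad
\frac{\Delta; \vdash M : N \qquad
      \Delta;\Gamma \vdash T : B : s_1 \qquad
      \Delta; \vdash \P\Gamma.B : s_2}
     {\Delta, b{:=}(\S\Gamma.T){:}(\P\Gamma.B); \vdash M : N}
\]

where $s_1 = *, \Box$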
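\[
\text{($\Pi$-form)} \quad
\frac{\Delta;\Gamma \vdash A : * \qquad \Delta;\Gamma, x{:}A \vdash B : *}
     {\Delta;\Gamma \vdash (\Pi x{:}A.B) : *}
\]

\[
\text{($\P$-form)} \quad
\frac{\Delta;\Gamma \vdash A : s_1 \qquad \Delta;\Gamma, x{:}A \vdash B : s_2}
     {\Delta;\Gamma \vdash (\P x{:}A.B) : \triangle}
\]

where $s_1 = *, \Box$

\[
\text{($\lambda$)} \quad
\frac{\Delta;\Gamma \vdash \Pi x{:}A.B : * \qquad \Delta;\Gamma, x{:}A \vdash F : B}
     {\Delta;\Gamma \vdash (\lambda x{:}A.F) : (\Pi x{:}A.B)}
\]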
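\[
\text{(App$_1$)} \quad
\frac{\Delta;\Gamma \vdash M : \Pi x{:}A.B \qquad \Delta;\Gamma \vdash N : A}
     {\Delta;\Gamma \vdash MN : B[x := N]}
\]

\[
\text{(App$_2$)} \quad
\frac{\Delta;\Gamma \vdash M : \P x{:}A.B \qquad \Delta;\Gamma \vdash N : A}
     {\Delta;\Gamma \vdash MN : B[x := N]}
\]

\[
\text{(Conv)} \quad
\frac{\Delta;\Gamma \vdash M : A \qquad \Delta;\Gamma \vdash B : s \qquad \Delta \vdash A =_{\beta\delta} B}
     {\Delta;\Gamma \vdash M : B}
\]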


The newly introduced variables in the Start-rules and Weakening-rules are
assumed to be fresh. Moreover, when introducing a variable $x$ with a “pc”-
rule or a “dc”-rule, we assume $x \in \mathcal{C}$, and when introducing $x$
via a “v”-rule, we assume $x \in \mathcal{V}$.


Bc CD-PTSs and their subsystems


Bc1 PTSs with parameters and definitions 


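Definition B.15 (Terms, 6.1) The set $\mathcal{T}_P$ of parametric terms is
defined together with the set $\mathcal{L}_V$ of lists of variables and the
set $\mathcal{L}_T$ of lists of terms:

\[
\begin{array}{rcl}
\mathcal{T}_P & ::= & \mathcal{V} \mid \mathcal{S} \mid \mathcal{C}(\mathcal{L}_T) \mid \mathcal{T}_P\mathcal{T}_P
   \mid \lambda\mathcal{V}{:}\mathcal{T}_P.\mathcal{T}_P \mid \Pi\mathcal{V}{:}\mathcal{T}_P.\mathcal{T}_P
   \mid \mathcal{C}(\mathcal{L}_V){=}\mathcal{T}_P{:}\mathcal{T}_P \ \mathsf{IN}\ \mathcal{T}_P; \\
\mathcal{L}_V & ::= & \varnothing \mid \langle \mathcal{L}_V, \mathcal{V}{:}\mathcal{T}_P \rangle; \\
\mathcal{L}_T & ::= & \varnothing \mid \langle \mathcal{L}_T, \mathcal{T}_P \rangle.
\end{array}
\]

where $\mathcal{V}$ is a set of variables, $\mathcal{C}$ is a set of constants,
and $\mathcal{S}$ is a set of sorts.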
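
Definition B.16 The set of contexts is given by

\[
\mathcal{C}_P ::= \varnothing \mid \langle \mathcal{C}_P, \mathcal{V}{:}\mathcal{T}_P \rangle
 \mid \langle \mathcal{C}_P, \mathcal{C}(\mathcal{L}_V){=}\mathcal{T}_P{:}\mathcal{T}_P \rangle
 \mid \langle \mathcal{C}_P, \mathcal{C}(\mathcal{L}_V){:}\mathcal{T}_P \rangle.
\]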
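
Definition B.17 (C: parametric constants, 6.20) The typing relation
$\vdash^{C}$ is the smallest relation on
$\mathcal{C}_P \times \mathcal{T}_P \times \mathcal{T}_P$ closed under the
rules in Definition A.20 and the following ones (we write
$\Delta \equiv x_1{:}B_1, \ldots, x_n{:}B_n$):

\[
\text{(C-weak)} \quad
\frac{\Gamma \vdash^{C} b : B \qquad \Gamma, \Delta \vdash^{C} A : s}
     {\Gamma, c(\Delta){:}A \vdash^{C} b : B}
\]

\[
\text{(C-app)} \quad
\frac{\begin{array}{l}
        \Gamma_1, c(\Delta){:}A, \Gamma_2 \vdash^{C} b_i : B_i[x_j := b_j]_{j=1}^{i-1} \quad (i = 1, \ldots, n) \\
        \Gamma_1, c(\Delta){:}A, \Gamma_2 \vdash^{C} A : s \quad (\text{if } n = 0)
      \end{array}}
     {\Gamma_1, c(\Delta){:}A, \Gamma_2 \vdash^{C} c(b_1, \ldots, b_n) : A[x_j := b_j]_{j=1}^{n}}
\]

where $s \in \mathcal{S}$ and the $c$ that is introduced in the C-weakening
rule is assumed to be $\Gamma$-fresh.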


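Definition B.18 (D: parametric definitions, 6.21) The typing relation
$\vdash^{D}$ is the smallest relation on
$\mathcal{C}_P \times \mathcal{T}_P \times \mathcal{T}_P$ closed under the
rules in Definition A.20 and the following ones:

\[
\text{(D-weak)} \quad
\frac{\Gamma \vdash^{D} b : B \qquad \Gamma, \Delta \vdash^{D} a : A}
     {\Gamma, c(\Delta){=}a{:}A \vdash^{D} b : B}
\]

\[
\text{(D-app)} \quad
\frac{\begin{array}{l}
        \Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \vdash^{D} b_i : B_i[x_j := b_j]_{j=1}^{i-1} \quad (i = 1, \ldots, n) \\
        \Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \vdash^{D} a : A \quad (\text{if } n = 0)
      \end{array}}
     {\Gamma_1, c(\Delta){=}a{:}A, \Gamma_2 \vdash^{D} c(b_1, \ldots, b_n) : A[x_j := b_j]_{j=1}^{n}}
\]

\[
\text{(D-form)} \quad
\frac{\Gamma, c(\Delta){=}a{:}A \vdash^{D} B : s}
     {\Gamma \vdash^{D} c(\Delta){=}a{:}A \ \mathsf{IN}\ B : s}
\]

\[
\text{(D-intro)} \quad
\frac{\Gamma, c(\Delta){=}a{:}A \vdash^{D} b : B \qquad
      \Gamma \vdash^{D} c(\Delta){=}a{:}A \ \mathsf{IN}\ B : s}
     {\Gamma \vdash^{D} c(\Delta){=}a{:}A \ \mathsf{IN}\ b : c(\Delta){=}a{:}A \ \mathsf{IN}\ B}
\]

\[
\text{(D-conv)} \quad
\frac{\Gamma \vdash^{D} b : B \qquad \Gamma \vdash^{D} B' : s \qquad \Gamma \vdash B =_{\delta} B'}
     {\Gamma \vdash^{D} b : B'}
\]

where $s \in \mathcal{S}$, and the $c$ that is introduced in the D-weakening
rule is assumed to be $\Gamma$-fresh.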

Definition B.19 (Pure Type Systems with (parametric) constants
and (parametric) definitions, 6.22) Let $\mathfrak{S}$ be a specification.

• A pure type system with (parametric) constants (C-PTS) is denoted
$\lambda^{C}(\mathfrak{S})$ and consists of a set of terms $\mathcal{T}_P$, a
set of contexts $\mathcal{C}_P$, the β-reduction rule and the typing relation
$\vdash^{C}$;


• A pure type system with (parametric) definitions (D-PTS) is denoted
$\lambda^{D}(\mathfrak{S})$ and consists of a set of terms $\mathcal{T}_P$, a
set of contexts $\mathcal{C}_P$, β- and δ-reduction and the typing relation
$\vdash^{D}$;


• A pure type system with (parametric) constants and (parametric)
definitions (CD-PTS) is denoted $\lambda^{CD}(\mathfrak{S})$ and consists of a
set of terms $\mathcal{T}_P$, a set of contexts $\mathcal{C}_P$, β- and
δ-reduction and the typing relation $\vdash^{CD}$, which is the smallest
relation on $\mathcal{C}_P \times \mathcal{T}_P \times \mathcal{T}_P$ that is
closed under the rules of Definition A.20 and the rules of $\vdash^{C}$ and
$\vdash^{D}$.
Bc2 PTSs with restricted parameters and definitions

Definition B.20 (Parametric Specification, 6.64) A parametric
specification is a quadruple (S, A, R, P) such that (S, A, R) is a
specification (cf. Definition A.17), and
$P \subseteq S \times (S \cup \{\mathrm{TOP}\})$. The parametric specification
is called singly sorted if the specification (S, A, R) is singly sorted.
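
The conditions of Definition B.20 are easy to check mechanically. The sketch below assumes the usual reading of Definition A.17, which is not reproduced in this appendix: a specification has axioms $A \subseteq S \times S$ and rules $R \subseteq S \times S \times S$, and it is singly sorted when axioms and rules are functional in their last component. The function names and the example specification are hypothetical.

```python
TOP = "TOP"  # stands for the extra element in S ∪ {TOP}


def is_specification(sorts, axioms, rules):
    """Assumed reading of Definition A.17: A ⊆ S×S and R ⊆ S×S×S."""
    return (all(len(a) == 2 and set(a) <= sorts for a in axioms)
            and all(len(r) == 3 and set(r) <= sorts for r in rules))


def is_singly_sorted(axioms, rules):
    """Axioms and rules are functional in their last component."""
    seen_axioms = {}
    for s1, s2 in axioms:
        if seen_axioms.setdefault(s1, s2) != s2:
            return False
    seen_rules = {}
    for s1, s2, s3 in rules:
        if seen_rules.setdefault((s1, s2), s3) != s3:
            return False
    return True


def is_parametric_specification(sorts, axioms, rules, params):
    """(S, A, R) is a specification and P ⊆ S × (S ∪ {TOP})."""
    return (is_specification(sorts, axioms, rules)
            and all(s1 in sorts and (s2 in sorts or s2 == TOP)
                    for s1, s2 in params))


# A small example specification, chosen for illustration only
EXAMPLE_SORTS = {"*", "□"}
EXAMPLE_AXIOMS = {("*", "□")}
EXAMPLE_RULES = {("*", "*", "*")}
EXAMPLE_PARAMS = {("*", "*"), ("*", TOP)}
```

With these values `is_parametric_specification` accepts the example, while a pair such as `(TOP, "*")` in the parameter set is rejected, because TOP may only appear in the second component.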


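Definition B.21 (C: restricted constants, 6.65) Let (S, A, R, P) be
a parametric specification. The restricted typing relation, written
$\vdash^{rC}$ here, is obtained from the relation $\vdash^{C}$ by replacing
rule (C-weak) by the following restricted rule (rC-weak):

\[
\frac{\Gamma \vdash^{rC} b : B \qquad \Gamma, \Delta_i \vdash^{rC} B_i : s_i \qquad \Gamma, \Delta \vdash^{rC} A : s}
     {\Gamma, c(\Delta){:}A \vdash^{rC} b : B}
\quad (s_i, s) \in P.
\]


Definition B.22 (D: restricted definitions, 6.66) Let (S, A, R, P)
be a parametric specification. The restricted typing relation, written
$\vdash^{rD}$ here, is obtained from the relation $\vdash^{D}$ by replacing
rule (D-weak) by the following restricted rules (rD-weak):

\[
\frac{\Gamma \vdash^{rD} b : B \qquad \Gamma, \Delta_i \vdash^{rD} B_i : s_i \qquad \Gamma, \Delta \vdash^{rD} a : A : s}
     {\Gamma, c(\Delta){=}a{:}A \vdash^{rD} b : B}
\quad (s_i, s) \in P;
\]

\[
\frac{\Gamma \vdash^{rD} b : B \qquad \Gamma, \Delta_i \vdash^{rD} B_i : s_i \qquad \Gamma, \Delta \vdash^{rD} a : A : \mathrm{TOP}}
     {\Gamma, c(\Delta){=}a{:}A \vdash^{rD} b : B}
\quad (s_i, \mathrm{TOP}) \in P.
\]


Definition B.23 Let (S, A, R, P) be a parametric specification. The
typing relation $\vdash^{rCD}$ is obtained from the relation $\vdash^{CD}$ by
replacing rule (C-weak) by rule (rC-weak) and rule (D-weak) by rules
(rD-weak).


Definition B.24 (Pure Type Systems with Restricted Parameters
and Restricted Parametric Definitions, 6.68) Let
$\mathfrak{S} = (S, A, R, P)$ be a parametric specification. The pure type
system with restricted parameters and restricted parametric definitions
(restricted CD-PTS) and parametric specification $\mathfrak{S}$ is denoted
$\lambda^{rCD}(\mathfrak{S})$. The system consists of the set of terms
$\mathcal{T}_P$, the set of contexts $\mathcal{C}_P$, β-reduction,
δ-reduction, and the typing relation $\vdash^{rCD}$.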
Summary


In this thesis we provide insight into the evolution of the notion of type in
logic and mathematics during the last one hundred and twenty years. We
want to stress that we do more than merely give a historical overview.
We not only describe the type systems that have been developed in this
period, but also describe them in a modern terminology. This terminology
meets contemporary requirements on formality and accuracy. In this way,
the systems can be described within one framework, making a comparison
between the systems possible.

We chronologically follow the development of type theory, starting with 
Frege (1879). Some important, though less-known type systems are stud- 
ied. They are described in a modern terminology without violating the 
original philosophy behind them. This results in a modern and historically 
correct description of the various systems. We also discuss some important 
developments in type theory and their influence on modern type theory. 

The most important basics of current type theory, functional abstraction
and function application, can already be found in the theories of Frege. His
notion of abstraction is incorporated in Bertrand Russell's Ramified Type
Theory (RTT) (1908), which was constructed as a solution for the logical
paradoxes that arose at the turn of the century. The thesis provides a
formalisation of RTT. It appears that the notion of function application is
only introduced at an informal level in the original system. Using techniques
of λ-calculus we give an accurate definition of function application as it is
present in RTT.

RTT has two hierarchies: one of types and one of orders. Ramsey (1926)
shows that the logical paradoxes can also be avoided when using a simple
type theory, without a hierarchy of orders. However, the thesis shows that
orders still play an important role in logic. It describes a close relation
between the orders of RTT and the truth levels in Kripke’s Theory of Truth
(1975). Kripke’s truth levels can be seen as a semantical interpretation of
the notion of order in RTT.

After having translated RTT into modern terminology, we describe RTT
in so-called “propositions-as-types” style. This gives RTT a position in the
framework of Pure Type Systems (PTSs), in which many modern type
systems have already been classified.

The type theory at the basis of the proof checker AUTOMATH has al- 
ready been described before in the PTS framework. However, no attention 
has been paid to the definition and parameter mechanisms, which play 
prominent roles in AUTOMATH. The thesis gives a detailed description of 
AUTOMATH. Then we extend the framework of PTSs with a parameter 
mechanism. This mechanism is constructed in such a way that it can be 
combined with the extension of PTSs with definitions as described by Severi 
and Poll. In the refined framework, PTSs with definitions and parameters, 
we not only classify various AUTOMATH systems, but also other important 
type systems, like LF and ML. 


Samenvatting (Summary)


The thesis aims to give insight into the development of the notion of type
in logic and mathematics over the past one hundred and twenty years. It
explicitly does more than write history in one form or another. The type
systems developed in this period are therefore not only described; they are
also translated into a modern terminology that meets the requirements
placed on formal type systems today. As a result, these systems can be
placed within one and the same framework, which makes it possible to
compare them with one another.

The thesis follows the development of type theory since Frege (1879). A
number of important but less well-known type systems are studied. These
systems are described in the terminology that is customary today, without
doing violence to the original philosophy behind each system. This yields a
modern yet historically sound description of the various type systems. A
number of important developments within type theory are also described,
and their influence on present-day type theory is discussed.

It turns out that abstraction (one of the most important pillars of modern
type theory, function abstraction and function application) can already be
found in the theory of Frege. Frege's notion of abstraction is adopted by
Bertrand Russell in his Ramified Type Theory (RTT) (1908), which arose as
a reaction to the logical paradoxes discovered around the turn of the
century. The thesis gives a formalisation of RTT. The notion of function
application in particular turns out to be present only at an informal level
in the original system. Using techniques from the modern λ-calculus, an
accurate formulation can be given of function application as it is present
in RTT.

RTT consists of two hierarchies: one of types and one of orders. Ramsey
(1926) shows that the logical paradoxes can also be avoided with a simple
type theory, which has no hierarchy of orders. The thesis shows, however,
that the notion of order still occupies an important place in logic. It
establishes a precise connection between the orders of RTT and the truth
levels in Kripke's Theory of Truth (1975). Kripke's truth levels turn out to
be a semantic interpretation of the notion of order in RTT.

Besides translating RTT into modern terminology, the thesis also describes
RTT in so-called “propositions-as-types” style. This gives RTT a place
within the framework of Pure Type Systems (PTSs), a framework in which
many modern type systems have already been classified.

The type theory underlying the proof checker AUTOMATH has been placed
in the PTS framework before, but without attention to the definition
mechanism and the parameter mechanism, both of which are prominent in
AUTOMATH. The thesis first gives a precise description of AUTOMATH. The
framework of PTSs is then extended with a parameter mechanism. This
extension is set up in such a way that it can be combined with the
extension of PTSs with definitions as described by Severi and Poll. In the
finer framework obtained in this way, PTSs with definitions and parameters,
not only the various AUTOMATH systems are classified. Other important type
systems, such as those underlying LF and ML, can also be described more
precisely in this finer framework.